Overview

Dataset statistics

Number of variables: 49
Number of observations: 2751
Missing cells: 28860
Missing cells (%): 21.4%
Duplicate rows: 5
Duplicate rows (%): 0.2%
Total size in memory: 3.6 MiB
Average record size in memory: 1.3 KiB
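
The percentages in this block follow from the raw counts. The sketch below re-derives them, assuming that "Missing cells (%)" is taken over all cells (variables × observations) and that MiB and KiB are binary units; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 49
n_observations = 2751
missing_cells = 28860
duplicate_rows = 5
total_size_mib = 3.6

# Assumption: the missing-cell percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 MiB = 1024 KiB.
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

Under these assumptions the recomputed values round to the figures shown above.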

Variable types

Categorical: 32
DateTime: 2
Numeric: 14
Boolean: 1

Dataset

Description: JHB_Aurum_009 - Quality-corrected harmonized data
Creator: RP2 Clinical Data Quality Team
Author: Quality-Checked Data
URL: HEAT Research Projects

Variable descriptions

Age (at enrolment): Patient age at study enrollment
Sex: Biological sex
Race: Racial/ethnic group
CD4 cell count (cells/µL): CD4+ T lymphocyte count (missing codes removed)
HIV viral load (copies/mL): HIV RNA copies per mL (missing codes removed)
Antiretroviral Therapy Status: Current ART status
BMI (kg/m²): Body Mass Index (extreme values removed)
Waist circumference (cm): Waist circumference (corrected from mm to cm)
weight_kg: Body weight in kilograms
height_m: Height in meters
Hematocrit (%): Hematocrit (zero values removed)
hemoglobin_g_dL: Hemoglobin concentration
White blood cell count (×10³/µL): Total WBC count
Red blood cell count (×10⁶/µL): Total RBC count
Platelet count (×10³/µL): Platelet count (missing codes removed)
MCV (MEAN CELL VOLUME): Mean corpuscular volume
mch_pg: Mean corpuscular hemoglobin
mchc_g_dL: Mean corpuscular hemoglobin concentration
RDW: Red cell distribution width
Lymphocyte count (×10⁹/L): Lymphocyte absolute count (corrected labeling)
Neutrophil count (×10⁹/L): Neutrophil absolute count (corrected labeling)
Monocyte count (×10⁹/L): Monocyte absolute count (corrected labeling)
Eosinophil count (×10⁹/L): Eosinophil absolute count (corrected labeling)
Basophil count (×10⁹/L): Basophil absolute count (corrected labeling)
ALT (U/L): Alanine aminotransferase (missing codes removed)
AST (U/L): Aspartate aminotransferase
Alkaline phosphatase (U/L): Alkaline phosphatase
Total bilirubin (mg/dL): Total bilirubin
Albumin (g/dL): Serum albumin
Total protein (g/dL): Total serum protein
creatinine_umol_L: Serum creatinine
creatinine clearance: Estimated creatinine clearance
Sodium (mEq/L): Serum sodium
Potassium (mEq/L): Serum potassium
fasting_glucose_mmol_L: Fasting blood glucose
total_cholesterol_mg_dL: Total cholesterol
hdl_cholesterol_mg_dL: HDL cholesterol
ldl_cholesterol_mg_dL: LDL cholesterol
Triglycerides (mg/dL): Triglycerides
systolic_bp_mmHg: Systolic blood pressure
diastolic_bp_mmHg: Diastolic blood pressure
heart_rate_bpm: Heart rate (zero values removed)
Respiratory rate (breaths/min): Respiratory rate
Oxygen saturation (%): Oxygen saturation
body_temperature_celsius: Body temperature
climate_daily_mean_temp: Daily mean temperature
climate_daily_max_temp: Daily maximum temperature
climate_temp_anomaly: Temperature anomaly from baseline
climate_heat_day_p90: Heat day indicator (>90th percentile)
climate_heat_stress_index: Heat stress index
cd4_correction_applied: Quality flag: CD4 missing codes removed
final_comprehensive_fix_applied: Quality flag: Comprehensive corrections applied
waist_circ_unit_correction_applied: Quality flag: Waist circ unit corrected
sa_biomarker_standards: South African biomarker reference standards

Alerts

study_source has constant value "JHB_Aurum_009" [Constant]
latitude has constant value "-25.7479" [Constant]
longitude has constant value "28.2293" [Constant]
jhb_subregion has constant value "Eastern_JHB" [Constant]
city has constant value "Johannesburg" [Constant]
province has constant value "Gauteng" [Constant]
country has constant value "South Africa" [Constant]
Country has constant value "South Africa" [Constant]
Clinical Study ID has constant value "Tholimpilo_HIV_Linkage_Study" [Constant]
Location of study follow-up has constant value "Aurum Institute - Multi-site Gauteng and Limpopo" [Constant]
coordinate_source has constant value "JHB_Aurum_009" [Constant]
coordinate_precision has constant value "high" [Constant]
geographic_source has constant value "harmonized_datasets" [Constant]
HIV_status has constant value "Positive" [Constant]
johannesburg_metro_valid has constant value "1.0" [Constant]
study_site_location has constant value "Tembisa/East Rand (Aurum Institute)" [Constant]
climate_p90_threshold has constant value "28.409" [Constant]
climate_p95_threshold has constant value "29.704" [Constant]
climate_p99_threshold has constant value "31.797" [Constant]
sa_biomarker_standards has constant value "1.0" [Constant]
final_comprehensive_fix_applied has constant value "1.0" [Constant]
total_protein_extreme_flag has constant value "0.0" [Constant]
dphru_053_final_corrections_applied has constant value "0.0" [Constant]
ezin_002_final_corrections_applied has constant value "0.0" [Constant]
quality_harmonization_version has constant value "2.0" [Constant]
waist_circ_unit_correction_applied has constant value "False" [Constant]
Dataset has 5 (0.2%) duplicate rows [Duplicates]
CD4 cell count (cells/µL) is highly overall correlated with cd4_correction_applied [High correlation]
cd4_correction_applied is highly overall correlated with CD4 cell count (cells/µL) [High correlation]
climate_14d_mean_temp is highly overall correlated with climate_30d_mean_temp and 11 other fields [High correlation]
climate_30d_mean_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields [High correlation]
climate_7d_max_temp is highly overall correlated with climate_14d_mean_temp and 7 other fields [High correlation]
climate_7d_mean_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields [High correlation]
climate_daily_max_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields [High correlation]
climate_daily_mean_temp is highly overall correlated with climate_14d_mean_temp and 12 other fields [High correlation]
climate_daily_min_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields [High correlation]
climate_heat_day_p90 is highly overall correlated with climate_14d_mean_temp and 12 other fields [High correlation]
climate_heat_day_p95 is highly overall correlated with climate_14d_mean_temp and 12 other fields [High correlation]
climate_heat_stress_index is highly overall correlated with climate_14d_mean_temp and 11 other fields [High correlation]
climate_season is highly overall correlated with climate_14d_mean_temp and 14 other fields [High correlation]
climate_standardized_anomaly is highly overall correlated with climate_daily_mean_temp and 4 other fields [High correlation]
climate_temp_anomaly is highly overall correlated with climate_heat_day_p90 and 6 other fields [High correlation]
month is highly overall correlated with climate_heat_day_p90 and 4 other fields [High correlation]
season is highly overall correlated with climate_14d_mean_temp and 13 other fields [High correlation]
year is highly overall correlated with climate_14d_mean_temp and 10 other fields [High correlation]
climate_heat_day_p90 is highly imbalanced (69.4%) [Imbalance]
climate_heat_day_p95 is highly imbalanced (69.4%) [Imbalance]
cd4_correction_applied is highly imbalanced (85.9%) [Imbalance]
CD4 cell count (cells/µL) has 533 (19.4%) missing values [Missing]
HIV viral load (copies/mL) has 2461 (89.5%) missing values [Missing]
climate_daily_mean_temp has 1616 (58.7%) missing values [Missing]
climate_daily_max_temp has 1616 (58.7%) missing values [Missing]
climate_daily_min_temp has 1616 (58.7%) missing values [Missing]
climate_7d_mean_temp has 1616 (58.7%) missing values [Missing]
climate_7d_max_temp has 1616 (58.7%) missing values [Missing]
climate_14d_mean_temp has 1616 (58.7%) missing values [Missing]
climate_30d_mean_temp has 1616 (58.7%) missing values [Missing]
climate_temp_anomaly has 1616 (58.7%) missing values [Missing]
climate_standardized_anomaly has 1616 (58.7%) missing values [Missing]
climate_heat_day_p90 has 1616 (58.7%) missing values [Missing]
climate_heat_day_p95 has 1616 (58.7%) missing values [Missing]
climate_heat_stress_index has 1616 (58.7%) missing values [Missing]
climate_p90_threshold has 1616 (58.7%) missing values [Missing]
climate_p95_threshold has 1616 (58.7%) missing values [Missing]
climate_p99_threshold has 1616 (58.7%) missing values [Missing]
climate_season has 1616 (58.7%) missing values [Missing]
HIV viral load (copies/mL) has 246 (8.9%) zeros [Zeros]

Reproduction

Analysis started: 2025-11-24 22:05:33.899395
Analysis finished: 2025-11-24 22:05:42.186232
Duration: 8.29 seconds
Software version: ydata-profiling v4.18.0
Download configuration: config.json

Variables

study_source
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.1 KiB
JHB_Aurum_009: 2751

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters: 35763
Distinct characters: 10
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
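
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count Unicode general categories for the constant value of `study_source`, scaled by the number of rows. The `CATEGORY_NAMES` mapping and the variable names are illustrative choices, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

value = "JHB_Aurum_009"  # constant value of study_source in the report
n_rows = 2751            # number of observations in the report

# Readable names for the general-category codes that occur in this value
# (illustrative mapping covering only the codes needed here).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
}

# Count categories within one value, then scale by the number of rows,
# since every row holds the same string.
per_value = Counter(unicodedata.category(ch) for ch in value)
category_counts = {CATEGORY_NAMES[code]: n * n_rows for code, n in per_value.items()}
total_characters = len(value) * n_rows

for name, count in sorted(category_counts.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {count} ({100 * count / total_characters:.1f}%)")
```

The resulting counts can be compared with the "Most occurring categories" table for this variable.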

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowJHB_Aurum_009
2nd rowJHB_Aurum_009
3rd rowJHB_Aurum_009
4th rowJHB_Aurum_009
5th rowJHB_Aurum_009

Common Values

ValueCountFrequency (%)
JHB_Aurum_0092751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
jhb_aurum_0092751
100.0%

Most occurring characters

ValueCountFrequency (%)
_5502
15.4%
u5502
15.4%
05502
15.4%
J2751
7.7%
H2751
7.7%
B2751
7.7%
A2751
7.7%
r2751
7.7%
m2751
7.7%
92751
7.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 11004 | 30.8%
Uppercase Letter | 11004 | 30.8%
Decimal Number | 8253 | 23.1%
Connector Punctuation | 5502 | 15.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
J2751
25.0%
H2751
25.0%
B2751
25.0%
A2751
25.0%
Lowercase Letter
ValueCountFrequency (%)
u5502
50.0%
r2751
25.0%
m2751
25.0%
Decimal Number
ValueCountFrequency (%)
05502
66.7%
92751
33.3%
Connector Punctuation
ValueCountFrequency (%)
_5502
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin22008
61.5%
Common13755
38.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
u5502
25.0%
J2751
12.5%
H2751
12.5%
B2751
12.5%
A2751
12.5%
r2751
12.5%
m2751
12.5%
Common
ValueCountFrequency (%)
_5502
40.0%
05502
40.0%
92751
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII35763
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_5502
15.4%
u5502
15.4%
05502
15.4%
J2751
7.7%
H2751
7.7%
B2751
7.7%
A2751
7.7%
r2751
7.7%
m2751
7.7%
92751
7.7%
Distinct: 447
Distinct (%): 16.2%
Missing: 0
Missing (%): 0.0%
Memory size: 43.0 KiB
Minimum: 2013-03-14 00:00:00
Maximum: 2015-08-01 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

year
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.3 KiB
2014.0: 1677
2013.0: 1073
2015.0: 1

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters16506
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row2014.0
2nd row2014.0
3rd row2014.0
4th row2014.0
5th row2013.0

Common Values

Value | Count | Frequency (%)
2014.0 | 1677 | 61.0%
2013.0 | 1073 | 39.0%
2015.0 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2014.01677
61.0%
2013.01073
39.0%
2015.01
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
05502
33.3%
22751
16.7%
12751
16.7%
.2751
16.7%
41677
 
10.2%
31073
 
6.5%
51
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number13755
83.3%
Other Punctuation2751
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
05502
40.0%
22751
20.0%
12751
20.0%
41677
 
12.2%
31073
 
7.8%
51
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common16506
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
05502
33.3%
22751
16.7%
12751
16.7%
.2751
16.7%
41677
 
10.2%
31073
 
6.5%
51
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII16506
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
05502
33.3%
22751
16.7%
12751
16.7%
.2751
16.7%
41677
 
10.2%
31073
 
6.5%
51
 
< 0.1%

month
Real number (ℝ)

High correlation 

Distinct: 12
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.9465649
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
median: 7
Q3: 10
95-th percentile: 11
Maximum: 12
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.9715711
Coefficient of variation (CV): 0.42777561
Kurtosis: -1.0043626
Mean: 6.9465649
Median Absolute Deviation (MAD): 2
Skewness: -0.30693242
Sum: 19110
Variance: 8.8302346
Monotonicity: Not monotonic
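
Several of these statistics can be recomputed from the value counts listed for `month`. The sketch below does so, assuming the variance uses the sample (n - 1) denominator and that the coefficient of variation is the standard deviation divided by the mean; the names are illustrative.

```python
import math

# Value counts for `month`, as listed in the report's frequency tables.
month_counts = {1: 62, 2: 250, 3: 157, 4: 205, 5: 216, 6: 199,
                7: 373, 8: 273, 9: 321, 10: 413, 11: 215, 12: 67}

n = sum(month_counts.values())                        # number of observations
total = sum(v * c for v, c in month_counts.items())   # "Sum"
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
variance = sum(c * (v - mean) ** 2 for v, c in month_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(f"n={n}, sum={total}, mean={mean:.7f}")
print(f"variance={variance:.7f}, std={std:.7f}, cv={cv:.8f}")
```

With these assumptions the results agree with the reported Sum, Mean, Variance, Standard deviation and CV to the precision shown.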
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
10 | 413 | 15.0%
7 | 373 | 13.6%
9 | 321 | 11.7%
8 | 273 | 9.9%
2 | 250 | 9.1%
5 | 216 | 7.9%
11 | 215 | 7.8%
4 | 205 | 7.5%
6 | 199 | 7.2%
3 | 157 | 5.7%
Other values (2) | 129 | 4.7%

Value | Count | Frequency (%)
1 | 62 | 2.3%
2 | 250 | 9.1%
3 | 157 | 5.7%
4 | 205 | 7.5%
5 | 216 | 7.9%
6 | 199 | 7.2%
7 | 373 | 13.6%
8 | 273 | 9.9%
9 | 321 | 11.7%
10 | 413 | 15.0%

Value | Count | Frequency (%)
12 | 67 | 2.4%
11 | 215 | 7.8%
10 | 413 | 15.0%
9 | 321 | 11.7%
8 | 273 | 9.9%
7 | 373 | 13.6%
6 | 199 | 7.2%
5 | 216 | 7.9%
4 | 205 | 7.5%
3 | 157 | 5.7%

season
Categorical

High correlation 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 169.3 KiB
Spring: 949
Winter: 845
Autumn: 578
Summer: 379
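
The season counts are consistent with a Southern Hemisphere meteorological mapping of `month` (December to February as Summer, and so on). The sketch below applies that mapping to the month value counts from this report; the mapping itself is an assumption about how `season` was derived, not something the report states.

```python
# Value counts for `month`, as listed in the report's frequency tables.
month_counts = {1: 62, 2: 250, 3: 157, 4: 205, 5: 216, 6: 199,
                7: 373, 8: 273, 9: 321, 10: 413, 11: 215, 12: 67}

# Assumed mapping: Southern Hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "Summer", 1: "Summer", 2: "Summer",
    3: "Autumn", 4: "Autumn", 5: "Autumn",
    6: "Winter", 7: "Winter", 8: "Winter",
    9: "Spring", 10: "Spring", 11: "Spring",
}

# Aggregate month counts into season counts.
season_counts = {}
for month, count in month_counts.items():
    season = SEASON_BY_MONTH[month]
    season_counts[season] = season_counts.get(season, 0) + count

print(season_counts)
```

Under this mapping the aggregated counts match the four season counts above.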

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters16506
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSummer
2nd rowAutumn
3rd rowWinter
4th rowAutumn
5th rowAutumn

Common Values

Value | Count | Frequency (%)
Spring | 949 | 34.5%
Winter | 845 | 30.7%
Autumn | 578 | 21.0%
Summer | 379 | 13.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
spring949
34.5%
winter845
30.7%
autumn578
21.0%
summer379
 
13.8%

Most occurring characters

ValueCountFrequency (%)
n2372
14.4%
r2173
13.2%
i1794
10.9%
u1535
9.3%
t1423
8.6%
m1336
8.1%
S1328
8.0%
e1224
7.4%
p949
5.7%
g949
5.7%
Other values (2)1423
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter13755
83.3%
Uppercase Letter2751
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n2372
17.2%
r2173
15.8%
i1794
13.0%
u1535
11.2%
t1423
10.3%
m1336
9.7%
e1224
8.9%
p949
6.9%
g949
6.9%
Uppercase Letter
ValueCountFrequency (%)
S1328
48.3%
W845
30.7%
A578
21.0%

Most occurring scripts

ValueCountFrequency (%)
Latin16506
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n2372
14.4%
r2173
13.2%
i1794
10.9%
u1535
9.3%
t1423
8.6%
m1336
8.1%
S1328
8.0%
e1224
7.4%
p949
5.7%
g949
5.7%
Other values (2)1423
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII16506
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n2372
14.4%
r2173
13.2%
i1794
10.9%
u1535
9.3%
t1423
8.6%
m1336
8.1%
S1328
8.0%
e1224
7.4%
p949
5.7%
g949
5.7%
Other values (2)1423
8.6%

latitude
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 174.6 KiB
-25.7479: 2751

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters22008
Distinct characters7
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row-25.7479
2nd row-25.7479
3rd row-25.7479
4th row-25.7479
5th row-25.7479

Common Values

ValueCountFrequency (%)
-25.74792751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
25.74792751
100.0%

Most occurring characters

ValueCountFrequency (%)
75502
25.0%
-2751
12.5%
22751
12.5%
52751
12.5%
.2751
12.5%
42751
12.5%
92751
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16506
75.0%
Dash Punctuation2751
 
12.5%
Other Punctuation2751
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
75502
33.3%
22751
16.7%
52751
16.7%
42751
16.7%
92751
16.7%
Dash Punctuation
ValueCountFrequency (%)
-2751
100.0%
Other Punctuation
ValueCountFrequency (%)
.2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common22008
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
75502
25.0%
-2751
12.5%
22751
12.5%
52751
12.5%
.2751
12.5%
42751
12.5%
92751
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII22008
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
75502
25.0%
-2751
12.5%
22751
12.5%
52751
12.5%
.2751
12.5%
42751
12.5%
92751
12.5%

longitude
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 171.9 KiB
28.2293: 2751

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters19257
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row28.2293
2nd row28.2293
3rd row28.2293
4th row28.2293
5th row28.2293

Common Values

ValueCountFrequency (%)
28.22932751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
28.22932751
100.0%

Most occurring characters

ValueCountFrequency (%)
28253
42.9%
82751
 
14.3%
.2751
 
14.3%
92751
 
14.3%
32751
 
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16506
85.7%
Other Punctuation2751
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
28253
50.0%
82751
 
16.7%
92751
 
16.7%
32751
 
16.7%
Other Punctuation
ValueCountFrequency (%)
.2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common19257
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
28253
42.9%
82751
 
14.3%
.2751
 
14.3%
92751
 
14.3%
32751
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII19257
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
28253
42.9%
82751
 
14.3%
.2751
 
14.3%
92751
 
14.3%
32751
 
14.3%

jhb_subregion
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 182.7 KiB
Eastern_JHB: 2751

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters30261
Distinct characters11
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEastern_JHB
2nd rowEastern_JHB
3rd rowEastern_JHB
4th rowEastern_JHB
5th rowEastern_JHB

Common Values

ValueCountFrequency (%)
Eastern_JHB2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
eastern_jhb2751
100.0%

Most occurring characters

ValueCountFrequency (%)
E2751
9.1%
a2751
9.1%
s2751
9.1%
t2751
9.1%
e2751
9.1%
r2751
9.1%
n2751
9.1%
_2751
9.1%
J2751
9.1%
H2751
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16506
54.5%
Uppercase Letter11004
36.4%
Connector Punctuation2751
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a2751
16.7%
s2751
16.7%
t2751
16.7%
e2751
16.7%
r2751
16.7%
n2751
16.7%
Uppercase Letter
ValueCountFrequency (%)
E2751
25.0%
J2751
25.0%
H2751
25.0%
B2751
25.0%
Connector Punctuation
ValueCountFrequency (%)
_2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin27510
90.9%
Common2751
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E2751
10.0%
a2751
10.0%
s2751
10.0%
t2751
10.0%
e2751
10.0%
r2751
10.0%
n2751
10.0%
J2751
10.0%
H2751
10.0%
B2751
10.0%
Common
ValueCountFrequency (%)
_2751
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII30261
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E2751
9.1%
a2751
9.1%
s2751
9.1%
t2751
9.1%
e2751
9.1%
r2751
9.1%
n2751
9.1%
_2751
9.1%
J2751
9.1%
H2751
9.1%

city
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 185.4 KiB
Johannesburg: 2751

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters33012
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowJohannesburg
2nd rowJohannesburg
3rd rowJohannesburg
4th rowJohannesburg
5th rowJohannesburg

Common Values

ValueCountFrequency (%)
Johannesburg2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
johannesburg2751
100.0%

Most occurring characters

ValueCountFrequency (%)
n5502
16.7%
J2751
8.3%
o2751
8.3%
h2751
8.3%
a2751
8.3%
e2751
8.3%
s2751
8.3%
b2751
8.3%
u2751
8.3%
r2751
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter30261
91.7%
Uppercase Letter2751
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n5502
18.2%
o2751
9.1%
h2751
9.1%
a2751
9.1%
e2751
9.1%
s2751
9.1%
b2751
9.1%
u2751
9.1%
r2751
9.1%
g2751
9.1%
Uppercase Letter
ValueCountFrequency (%)
J2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin33012
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n5502
16.7%
J2751
8.3%
o2751
8.3%
h2751
8.3%
a2751
8.3%
e2751
8.3%
s2751
8.3%
b2751
8.3%
u2751
8.3%
r2751
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII33012
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n5502
16.7%
J2751
8.3%
o2751
8.3%
h2751
8.3%
a2751
8.3%
e2751
8.3%
s2751
8.3%
b2751
8.3%
u2751
8.3%
r2751
8.3%

province
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 171.9 KiB
Gauteng: 2751

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters19257
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowGauteng
2nd rowGauteng
3rd rowGauteng
4th rowGauteng
5th rowGauteng

Common Values

ValueCountFrequency (%)
Gauteng2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
gauteng2751
100.0%

Most occurring characters

ValueCountFrequency (%)
G2751
14.3%
a2751
14.3%
u2751
14.3%
t2751
14.3%
e2751
14.3%
n2751
14.3%
g2751
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16506
85.7%
Uppercase Letter2751
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a2751
16.7%
u2751
16.7%
t2751
16.7%
e2751
16.7%
n2751
16.7%
g2751
16.7%
Uppercase Letter
ValueCountFrequency (%)
G2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin19257
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
G2751
14.3%
a2751
14.3%
u2751
14.3%
t2751
14.3%
e2751
14.3%
n2751
14.3%
g2751
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII19257
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G2751
14.3%
a2751
14.3%
u2751
14.3%
t2751
14.3%
e2751
14.3%
n2751
14.3%
g2751
14.3%

country
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 185.4 KiB
South Africa: 2751

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters33012
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSouth Africa
2nd rowSouth Africa
3rd rowSouth Africa
4th rowSouth Africa
5th rowSouth Africa

Common Values

ValueCountFrequency (%)
South Africa2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
south2751
50.0%
africa2751
50.0%

Most occurring characters

ValueCountFrequency (%)
S2751
8.3%
o2751
8.3%
u2751
8.3%
t2751
8.3%
h2751
8.3%
2751
8.3%
A2751
8.3%
f2751
8.3%
r2751
8.3%
i2751
8.3%
Other values (2)5502
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter24759
75.0%
Uppercase Letter5502
 
16.7%
Space Separator2751
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o2751
11.1%
u2751
11.1%
t2751
11.1%
h2751
11.1%
f2751
11.1%
r2751
11.1%
i2751
11.1%
c2751
11.1%
a2751
11.1%
Uppercase Letter
ValueCountFrequency (%)
S2751
50.0%
A2751
50.0%
Space Separator
ValueCountFrequency (%)
2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin30261
91.7%
Common2751
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
S2751
9.1%
o2751
9.1%
u2751
9.1%
t2751
9.1%
h2751
9.1%
A2751
9.1%
f2751
9.1%
r2751
9.1%
i2751
9.1%
c2751
9.1%
Common
ValueCountFrequency (%)
2751
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII33012
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S2751
8.3%
o2751
8.3%
u2751
8.3%
t2751
8.3%
h2751
8.3%
2751
8.3%
A2751
8.3%
f2751
8.3%
r2751
8.3%
i2751
8.3%
Other values (2)5502
16.7%

Age (at enrolment)
Real number (ℝ)

Patient age at study enrollment

Distinct: 59
Distinct (%): 2.1%
Missing: 6
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.426958
Minimum: 15
Maximum: 76
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 15
5-th percentile: 20
Q1: 27
median: 33
Q3: 40
95-th percentile: 54
Maximum: 76
Range: 61
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 10.178108
Coefficient of variation (CV): 0.29564354
Kurtosis: 0.24473046
Mean: 34.426958
Median Absolute Deviation (MAD): 7
Skewness: 0.70885633
Sum: 94502
Variance: 103.59388
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
31 | 125 | 4.5%
30 | 117 | 4.3%
29 | 116 | 4.2%
28 | 113 | 4.1%
27 | 108 | 3.9%
32 | 106 | 3.9%
26 | 104 | 3.8%
34 | 102 | 3.7%
24 | 101 | 3.7%
33 | 97 | 3.5%
Other values (49) | 1656 | 60.2%

Value | Count | Frequency (%)
15 | 4 | 0.1%
16 | 3 | 0.1%
17 | 15 | 0.5%
18 | 24 | 0.9%
19 | 40 | 1.5%
20 | 59 | 2.1%
21 | 56 | 2.0%
22 | 73 | 2.7%
23 | 85 | 3.1%
24 | 101 | 3.7%

Value | Count | Frequency (%)
76 | 1 | < 0.1%
74 | 1 | < 0.1%
72 | 2 | 0.1%
71 | 1 | < 0.1%
70 | 1 | < 0.1%
69 | 2 | 0.1%
68 | 3 | 0.1%
67 | 1 | < 0.1%
66 | 1 | < 0.1%
65 | 5 | 0.2%

Sex
Categorical

Biological sex

Distinct: 2
Distinct (%): 0.1%
Missing: 4
Missing (%): 0.1%
Memory size: 165.9 KiB
Male: 1708
Female: 1039

Length

Max length6
Median length4
Mean length4.7564616
Min length4

Characters and Unicode

Total characters13066
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFemale
2nd rowFemale
3rd rowMale
4th rowMale
5th rowFemale

Common Values

Value | Count | Frequency (%)
Male | 1708 | 62.1%
Female | 1039 | 37.8%
(Missing) | 4 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
male1708
62.2%
female1039
37.8%

Most occurring characters

ValueCountFrequency (%)
e3786
29.0%
a2747
21.0%
l2747
21.0%
M1708
13.1%
F1039
 
8.0%
m1039
 
8.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter10319
79.0%
Uppercase Letter2747
 
21.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e3786
36.7%
a2747
26.6%
l2747
26.6%
m1039
 
10.1%
Uppercase Letter
ValueCountFrequency (%)
M1708
62.2%
F1039
37.8%

Most occurring scripts

ValueCountFrequency (%)
Latin13066
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e3786
29.0%
a2747
21.0%
l2747
21.0%
M1708
13.1%
F1039
 
8.0%
m1039
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII13066
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e3786
29.0%
a2747
21.0%
l2747
21.0%
M1708
13.1%
F1039
 
8.0%
m1039
 
8.0%

CD4 cell count (cells/µL)
Real number (ℝ)

High correlation  Missing 

CD4+ T lymphocyte count (missing codes removed)

Distinct: 854
Distinct (%): 38.5%
Missing: 533
Missing (%): 19.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 456.95807
Minimum: 3
Maximum: 2703
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 108.85
Q1: 272
median: 416
Q3: 589
95-th percentile: 937
Maximum: 2703
Range: 2700
Interquartile range (IQR): 317

Descriptive statistics

Standard deviation: 268.47946
Coefficient of variation (CV): 0.58753632
Kurtosis: 7.1691831
Mean: 456.95807
Median Absolute Deviation (MAD): 155
Skewness: 1.6497118
Sum: 1013533
Variance: 72081.223
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
350 | 9 | 0.3%
315 | 9 | 0.3%
500 | 9 | 0.3%
467 | 9 | 0.3%
420 | 8 | 0.3%
336 | 8 | 0.3%
443 | 8 | 0.3%
354 | 8 | 0.3%
414 | 8 | 0.3%
564 | 8 | 0.3%
Other values (844) | 2134 | 77.6%
(Missing) | 533 | 19.4%

Value | Count | Frequency (%)
3 | 2 | 0.1%
6 | 1 | < 0.1%
8 | 1 | < 0.1%
10 | 1 | < 0.1%
15 | 1 | < 0.1%
16 | 1 | < 0.1%
20 | 1 | < 0.1%
21 | 1 | < 0.1%
28 | 1 | < 0.1%
29 | 1 | < 0.1%

Value | Count | Frequency (%)
2703 | 1 | < 0.1%
2609 | 2 | 0.1%
1996 | 1 | < 0.1%
1781 | 1 | < 0.1%
1725 | 1 | < 0.1%
1577 | 1 | < 0.1%
1568 | 1 | < 0.1%
1564 | 1 | < 0.1%
1549 | 1 | < 0.1%
1508 | 1 | < 0.1%

HIV viral load (copies/mL)
Real number (ℝ)

Missing  Zeros 

HIV RNA copies per mL (missing codes removed)

Distinct: 45
Distinct (%): 15.5%
Missing: 2461
Missing (%): 89.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 20363.586
Minimum: 0
Maximum: 2670000
Zeros: 246
Zeros (%): 8.9%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile7860.2
Maximum2670000
Range2670000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation196029.65
Coefficient of variation (CV)9.6264796
Kurtosis145.0072
Mean20363.586
Median Absolute Deviation (MAD)0
Skewness11.783887
Sum5905440
Variance3.8427622 × 1010
MonotonicityNot monotonic
Histogram with fixed size bins (bins=45)

Common values

Value              Count   Frequency (%)
0                  246     8.9%
8555               1       < 0.1%
378                1       < 0.1%
2435               1       < 0.1%
6442               1       < 0.1%
13795              1       < 0.1%
200                1       < 0.1%
31                 1       < 0.1%
1898105            1       < 0.1%
132                1       < 0.1%
Other values (35)  35      1.3%
(Missing)          2461    89.5%

Minimum 10 values

Value   Count   Frequency (%)
0       246     8.9%
10      1       < 0.1%
31      1       < 0.1%
51      1       < 0.1%
74      1       < 0.1%
82      1       < 0.1%
87      1       < 0.1%
132     1       < 0.1%
143     1       < 0.1%
174     1       < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
2670000   1       < 0.1%
1898105   1       < 0.1%
650442    1       < 0.1%
164351    1       < 0.1%
149247    1       < 0.1%
125054    1       < 0.1%
44011     1       < 0.1%
38500     1       < 0.1%
34868     1       < 0.1%
22276     1       < 0.1%
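
The viral load summary is dominated by zeros and missing values, which explains the extreme skewness. The sketch below recomputes the share of zeros among recorded measurements and shows one common way to put such data on a log10 scale. The `floor` of 1 copy/mL is a hypothetical choice made for this illustration, and treating zeros as undetectable results is an assumption the report does not confirm.

```python
import math

# Counts copied from the HIV viral load summary in this report.
n_total, n_missing, n_zero = 2751, 2461, 246
value_sum = 5905440

n_observed = n_total - n_missing      # rows with a recorded viral load
zero_share = n_zero / n_observed      # fraction of recorded values equal to 0
mean = value_sum / n_observed         # should match the reported mean

def log10_viral_load(copies_per_ml, floor=1.0):
    """Return log10 of a viral load, clamping values below `floor` (assumed) to it."""
    return math.log10(max(copies_per_ml, floor))

print(f"observed={n_observed}, zero share={zero_share:.1%}, mean={mean:.3f}")
print(f"log10(max)={log10_viral_load(2670000):.3f}")
```

About 85% of the 290 recorded measurements are zero, which is why every quantile up to Q3 is 0 while the mean exceeds 20,000 copies/mL.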

date
Date

Distinct447
Distinct (%)16.2%
Missing0
Missing (%)0.0%
Memory size43.0 KiB
Minimum2013-03-14 00:00:00
Maximum2015-08-01 00:00:00
Invalid dates0
Invalid dates (%)0.0%
Histogram with fixed size bins (bins=50)

Country
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size185.4 KiB
South Africa
2751 

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters33012
Distinct characters12
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSouth Africa
2nd rowSouth Africa
3rd rowSouth Africa
4th rowSouth Africa
5th rowSouth Africa

Common Values

ValueCountFrequency (%)
South Africa2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
south2751
50.0%
africa2751
50.0%

Most occurring characters

ValueCountFrequency (%)
S2751
8.3%
o2751
8.3%
u2751
8.3%
t2751
8.3%
h2751
8.3%
(space)2751
8.3%
A2751
8.3%
f2751
8.3%
r2751
8.3%
i2751
8.3%
Other values (2)5502
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter24759
75.0%
Uppercase Letter5502
 
16.7%
Space Separator2751
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o2751
11.1%
u2751
11.1%
t2751
11.1%
h2751
11.1%
f2751
11.1%
r2751
11.1%
i2751
11.1%
c2751
11.1%
a2751
11.1%
Uppercase Letter
ValueCountFrequency (%)
S2751
50.0%
A2751
50.0%
Space Separator
ValueCountFrequency (%)
(space)2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin30261
91.7%
Common2751
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
S2751
9.1%
o2751
9.1%
u2751
9.1%
t2751
9.1%
h2751
9.1%
A2751
9.1%
f2751
9.1%
r2751
9.1%
i2751
9.1%
c2751
9.1%
Common
ValueCountFrequency (%)
(space)2751
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII33012
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S2751
8.3%
o2751
8.3%
u2751
8.3%
t2751
8.3%
h2751
8.3%
(space)2751
8.3%
A2751
8.3%
f2751
8.3%
r2751
8.3%
i2751
8.3%
Other values (2)5502
16.7%
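
Variables flagged Constant, such as Country, hold the same value in every row and therefore carry no information for modelling. The sketch below shows one way to list such columns before analysis. It works on plain dictionaries, and both the helper name `constant_columns` and the three sample rows are hypothetical stand-ins for the real 2751-row dataset.

```python
def constant_columns(rows):
    """Return the names of columns that hold a single distinct value across all rows."""
    if not rows:
        return []
    return [
        name for name in rows[0]
        if len({row.get(name) for row in rows}) == 1
    ]

# Hypothetical three-row excerpt; the real dataset has 2751 rows.
sample_rows = [
    {"Country": "South Africa", "HIV_status": "Positive", "Sex": "Male"},
    {"Country": "South Africa", "HIV_status": "Positive", "Sex": "Female"},
    {"Country": "South Africa", "HIV_status": "Positive", "Sex": "Male"},
]

constant = constant_columns(sample_rows)
print(constant)
```

Whether to drop these columns or keep them as provenance metadata is a design choice: they are useless as predictors but may matter when this dataset is merged with other study sites.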

Clinical Study ID
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size228.4 KiB
Tholimpilo_HIV_Linkage_Study
2751 

Length

Max length28
Median length28
Mean length28
Min length28

Characters and Unicode

Total characters77028
Distinct characters22
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowTholimpilo_HIV_Linkage_Study
2nd rowTholimpilo_HIV_Linkage_Study
3rd rowTholimpilo_HIV_Linkage_Study
4th rowTholimpilo_HIV_Linkage_Study
5th rowTholimpilo_HIV_Linkage_Study

Common Values

ValueCountFrequency (%)
Tholimpilo_HIV_Linkage_Study2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
tholimpilo_hiv_linkage_study2751
100.0%

Most occurring characters

ValueCountFrequency (%)
i8253
 
10.7%
_8253
 
10.7%
o5502
 
7.1%
l5502
 
7.1%
T2751
 
3.6%
k2751
 
3.6%
d2751
 
3.6%
u2751
 
3.6%
t2751
 
3.6%
S2751
 
3.6%
Other values (12)33012
42.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter52269
67.9%
Uppercase Letter16506
 
21.4%
Connector Punctuation8253
 
10.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i8253
15.8%
o5502
 
10.5%
l5502
 
10.5%
k2751
 
5.3%
d2751
 
5.3%
u2751
 
5.3%
t2751
 
5.3%
e2751
 
5.3%
g2751
 
5.3%
a2751
 
5.3%
Other values (5)13755
26.3%
Uppercase Letter
ValueCountFrequency (%)
T2751
16.7%
S2751
16.7%
L2751
16.7%
V2751
16.7%
I2751
16.7%
H2751
16.7%
Connector Punctuation
ValueCountFrequency (%)
_8253
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin68775
89.3%
Common8253
 
10.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i8253
 
12.0%
o5502
 
8.0%
l5502
 
8.0%
T2751
 
4.0%
k2751
 
4.0%
d2751
 
4.0%
u2751
 
4.0%
t2751
 
4.0%
S2751
 
4.0%
e2751
 
4.0%
Other values (11)30261
44.0%
Common
ValueCountFrequency (%)
_8253
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII77028
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i8253
 
10.7%
_8253
 
10.7%
o5502
 
7.1%
l5502
 
7.1%
T2751
 
3.6%
k2751
 
3.6%
d2751
 
3.6%
u2751
 
3.6%
t2751
 
3.6%
S2751
 
3.6%
Other values (12)33012
42.9%

Location of study follow-up
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size282.1 KiB
Aurum Institute - Multi-site Gauteng and Limpopo
2751 

Length

Max length48
Median length48
Mean length48
Min length48

Characters and Unicode

Total characters132048
Distinct characters21
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAurum Institute - Multi-site Gauteng and Limpopo
2nd rowAurum Institute - Multi-site Gauteng and Limpopo
3rd rowAurum Institute - Multi-site Gauteng and Limpopo
4th rowAurum Institute - Multi-site Gauteng and Limpopo
5th rowAurum Institute - Multi-site Gauteng and Limpopo

Common Values

ValueCountFrequency (%)
Aurum Institute - Multi-site Gauteng and Limpopo2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
aurum2751
14.3%
institute2751
14.3%
2751
14.3%
multi-site2751
14.3%
gauteng2751
14.3%
and2751
14.3%
limpopo2751
14.3%

Most occurring characters

ValueCountFrequency (%)
(space)16506
12.5%
t16506
12.5%
u13755
 
10.4%
i11004
 
8.3%
e8253
 
6.2%
n8253
 
6.2%
p5502
 
4.2%
a5502
 
4.2%
-5502
 
4.2%
o5502
 
4.2%
Other values (11)35763
27.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter96285
72.9%
Space Separator16506
 
12.5%
Uppercase Letter13755
 
10.4%
Dash Punctuation5502
 
4.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t16506
17.1%
u13755
14.3%
i11004
11.4%
e8253
8.6%
n8253
8.6%
p5502
 
5.7%
a5502
 
5.7%
o5502
 
5.7%
s5502
 
5.7%
m5502
 
5.7%
Other values (4)11004
11.4%
Uppercase Letter
ValueCountFrequency (%)
I2751
20.0%
M2751
20.0%
G2751
20.0%
L2751
20.0%
A2751
20.0%
Space Separator
ValueCountFrequency (%)
(space)16506
100.0%
Dash Punctuation
ValueCountFrequency (%)
-5502
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin110040
83.3%
Common22008
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t16506
15.0%
u13755
12.5%
i11004
10.0%
e8253
 
7.5%
n8253
 
7.5%
p5502
 
5.0%
a5502
 
5.0%
o5502
 
5.0%
s5502
 
5.0%
m5502
 
5.0%
Other values (9)24759
22.5%
Common
ValueCountFrequency (%)
(space)16506
75.0%
-5502
 
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII132048
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space)16506
12.5%
t16506
12.5%
u13755
 
10.4%
i11004
 
8.3%
e8253
 
6.2%
n8253
 
6.2%
p5502
 
4.2%
a5502
 
4.2%
-5502
 
4.2%
o5502
 
4.2%
Other values (11)35763
27.1%

coordinate_source
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size188.1 KiB
JHB_Aurum_009
2751 

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters35763
Distinct characters10
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowJHB_Aurum_009
2nd rowJHB_Aurum_009
3rd rowJHB_Aurum_009
4th rowJHB_Aurum_009
5th rowJHB_Aurum_009

Common Values

Value          Count   Frequency (%)
JHB_Aurum_009  2751    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value          Count   Frequency (%)
jhb_aurum_009  2751    100.0%

Most occurring characters

Character  Count   Frequency (%)
_          5502    15.4%
u          5502    15.4%
0          5502    15.4%
J          2751    7.7%
H          2751    7.7%
B          2751    7.7%
A          2751    7.7%
r          2751    7.7%
m          2751    7.7%
9          2751    7.7%

Most occurring categories

Category               Count   Frequency (%)
Lowercase Letter       11004   30.8%
Uppercase Letter       11004   30.8%
Decimal Number         8253    23.1%
Connector Punctuation  5502    15.4%

Most frequent character per category

Uppercase Letter
Character  Count   Frequency (%)
J          2751    25.0%
H          2751    25.0%
B          2751    25.0%
A          2751    25.0%
Lowercase Letter
Character  Count   Frequency (%)
u          5502    50.0%
r          2751    25.0%
m          2751    25.0%
Decimal Number
Character  Count   Frequency (%)
0          5502    66.7%
9          2751    33.3%
Connector Punctuation
Character  Count   Frequency (%)
_          5502    100.0%

Most occurring scripts

Script  Count   Frequency (%)
Latin   22008   61.5%
Common  13755   38.5%

Most frequent character per script

Latin
Character  Count   Frequency (%)
u          5502    25.0%
J          2751    12.5%
H          2751    12.5%
B          2751    12.5%
A          2751    12.5%
r          2751    12.5%
m          2751    12.5%
Common
Character  Count   Frequency (%)
_          5502    40.0%
0          5502    40.0%
9          2751    20.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  35763   100.0%

Most frequent character per block

ASCII
Character  Count   Frequency (%)
_          5502    15.4%
u          5502    15.4%
0          5502    15.4%
J          2751    7.7%
H          2751    7.7%
B          2751    7.7%
A          2751    7.7%
r          2751    7.7%
m          2751    7.7%
9          2751    7.7%

coordinate_precision
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size163.9 KiB
high
2751 

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters11004
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowhigh
2nd rowhigh
3rd rowhigh
4th rowhigh
5th rowhigh

Common Values

ValueCountFrequency (%)
high2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
high2751
100.0%

Most occurring characters

ValueCountFrequency (%)
h5502
50.0%
i2751
25.0%
g2751
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11004
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
h5502
50.0%
i2751
25.0%
g2751
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin11004
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
h5502
50.0%
i2751
25.0%
g2751
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII11004
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
h5502
50.0%
i2751
25.0%
g2751
25.0%

geographic_source
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size204.2 KiB
harmonized_datasets
2751 

Length

Max length19
Median length19
Mean length19
Min length19

Characters and Unicode

Total characters52269
Distinct characters13
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowharmonized_datasets
2nd rowharmonized_datasets
3rd rowharmonized_datasets
4th rowharmonized_datasets
5th rowharmonized_datasets

Common Values

ValueCountFrequency (%)
harmonized_datasets2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
harmonized_datasets2751
100.0%

Most occurring characters

ValueCountFrequency (%)
a8253
15.8%
e5502
10.5%
d5502
10.5%
t5502
10.5%
s5502
10.5%
h2751
 
5.3%
r2751
 
5.3%
m2751
 
5.3%
o2751
 
5.3%
n2751
 
5.3%
Other values (3)8253
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter49518
94.7%
Connector Punctuation2751
 
5.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a8253
16.7%
e5502
11.1%
d5502
11.1%
t5502
11.1%
s5502
11.1%
h2751
 
5.6%
r2751
 
5.6%
m2751
 
5.6%
o2751
 
5.6%
n2751
 
5.6%
Other values (2)5502
11.1%
Connector Punctuation
ValueCountFrequency (%)
_2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin49518
94.7%
Common2751
 
5.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a8253
16.7%
e5502
11.1%
d5502
11.1%
t5502
11.1%
s5502
11.1%
h2751
 
5.6%
r2751
 
5.6%
m2751
 
5.6%
o2751
 
5.6%
n2751
 
5.6%
Other values (2)5502
11.1%
Common
ValueCountFrequency (%)
_2751
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII52269
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a8253
15.8%
e5502
10.5%
d5502
10.5%
t5502
10.5%
s5502
10.5%
h2751
 
5.3%
r2751
 
5.3%
m2751
 
5.3%
o2751
 
5.3%
n2751
 
5.3%
Other values (3)8253
15.8%

HIV_status
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size174.6 KiB
Positive
2751 

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters22008
Distinct characters7
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowPositive
2nd rowPositive
3rd rowPositive
4th rowPositive
5th rowPositive

Common Values

ValueCountFrequency (%)
Positive2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
positive2751
100.0%

Most occurring characters

ValueCountFrequency (%)
i5502
25.0%
P2751
12.5%
o2751
12.5%
s2751
12.5%
t2751
12.5%
v2751
12.5%
e2751
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter19257
87.5%
Uppercase Letter2751
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i5502
28.6%
o2751
14.3%
s2751
14.3%
t2751
14.3%
v2751
14.3%
e2751
14.3%
Uppercase Letter
ValueCountFrequency (%)
P2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin22008
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i5502
25.0%
P2751
12.5%
o2751
12.5%
s2751
12.5%
t2751
12.5%
v2751
12.5%
e2751
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII22008
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i5502
25.0%
P2751
12.5%
o2751
12.5%
s2751
12.5%
t2751
12.5%
v2751
12.5%
e2751
12.5%

johannesburg_metro_valid
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
1.0
2751 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

Value  Count   Frequency (%)
1.0    2751    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1.0    2751    100.0%

Most occurring characters

Character  Count   Frequency (%)
1          2751    33.3%
.          2751    33.3%
0          2751    33.3%

Most occurring categories

Category           Count   Frequency (%)
Decimal Number     5502    66.7%
Other Punctuation  2751    33.3%

Most frequent character per category

Decimal Number
Character  Count   Frequency (%)
1          2751    50.0%
0          2751    50.0%
Other Punctuation
Character  Count   Frequency (%)
.          2751    100.0%

Most occurring scripts

Script  Count   Frequency (%)
Common  8253    100.0%

Most frequent character per script

Common
Character  Count   Frequency (%)
1          2751    33.3%
.          2751    33.3%
0          2751    33.3%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  8253    100.0%

Most frequent character per block

ASCII
Character  Count   Frequency (%)
1          2751    33.3%
.          2751    33.3%
0          2751    33.3%
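
The report types johannesburg_metro_valid as Categorical with the three-character text value "1.0", which suggests a numeric flag that was stored as text. That reading is an assumption, not something the report states. The sketch below shows one way to parse such a field into a proper boolean; the helper name `parse_flag` is hypothetical.

```python
def parse_flag(text):
    """Convert a flag stored as text ('1.0', '0.0', '') to True, False, or None."""
    if text is None or text.strip() == "":
        return None                      # treat empty cells as missing
    value = float(text)
    if value not in (0.0, 1.0):
        raise ValueError(f"unexpected flag value: {text!r}")
    return value == 1.0

print(parse_flag("1.0"), parse_flag("0.0"), parse_flag(""))
```

Rejecting anything other than 0 or 1 keeps a mis-coded value from silently becoming a valid flag.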

study_site_location
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size247.2 KiB
Tembisa/East Rand (Aurum Institute)
2751 

Length

Max length35
Median length35
Mean length35
Min length35

Characters and Unicode

Total characters96285
Distinct characters20
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowTembisa/East Rand (Aurum Institute)
2nd rowTembisa/East Rand (Aurum Institute)
3rd rowTembisa/East Rand (Aurum Institute)
4th rowTembisa/East Rand (Aurum Institute)
5th rowTembisa/East Rand (Aurum Institute)

Common Values

ValueCountFrequency (%)
Tembisa/East Rand (Aurum Institute)2751
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
tembisa/east2751
25.0%
rand2751
25.0%
aurum2751
25.0%
institute2751
25.0%

Most occurring characters

ValueCountFrequency (%)
t11004
 
11.4%
(space)8253
 
8.6%
s8253
 
8.6%
a8253
 
8.6%
u8253
 
8.6%
m5502
 
5.7%
i5502
 
5.7%
e5502
 
5.7%
n5502
 
5.7%
(2751
 
2.9%
Other values (10)27510
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter66024
68.6%
Uppercase Letter13755
 
14.3%
Space Separator8253
 
8.6%
Open Punctuation2751
 
2.9%
Other Punctuation2751
 
2.9%
Close Punctuation2751
 
2.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t11004
16.7%
s8253
12.5%
a8253
12.5%
u8253
12.5%
m5502
8.3%
i5502
8.3%
e5502
8.3%
n5502
8.3%
r2751
 
4.2%
d2751
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
I2751
20.0%
A2751
20.0%
T2751
20.0%
R2751
20.0%
E2751
20.0%
Space Separator
ValueCountFrequency (%)
(space)8253
100.0%
Open Punctuation
ValueCountFrequency (%)
(2751
100.0%
Other Punctuation
ValueCountFrequency (%)
/2751
100.0%
Close Punctuation
ValueCountFrequency (%)
)2751
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin79779
82.9%
Common16506
 
17.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t11004
13.8%
s8253
10.3%
a8253
10.3%
u8253
10.3%
m5502
 
6.9%
i5502
 
6.9%
e5502
 
6.9%
n5502
 
6.9%
I2751
 
3.4%
r2751
 
3.4%
Other values (6)16506
20.7%
Common
ValueCountFrequency (%)
(space)8253
50.0%
(2751
 
16.7%
/2751
 
16.7%
)2751
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII96285
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t11004
 
11.4%
(space)8253
 
8.6%
s8253
 
8.6%
a8253
 
8.6%
u8253
 
8.6%
m5502
 
5.7%
i5502
 
5.7%
e5502
 
5.7%
n5502
 
5.7%
(2751
 
2.9%
Other values (10)27510
28.6%

climate_daily_mean_temp
Real number (ℝ)

High correlation  Missing 

Daily mean temperature

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.451807
Minimum: 9.356
Maximum: 23.589
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 9.356
5-th percentile: 9.356
Q1: 13.213
Median: 14.195
Q3: 19.293
95-th percentile: 23.589
Maximum: 23.589
Range: 14.233
Interquartile range (IQR): 6.08

Descriptive statistics

Standard deviation: 3.5385321
Coefficient of variation (CV): 0.22900442
Kurtosis: -0.30036519
Mean: 15.451807
Median Absolute Deviation (MAD): 0.982
Skewness: 0.47348153
Sum: 17537.801
Variance: 12.521209
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value      Count   Frequency (%)
19.293     214     7.8%
13.213     208     7.6%
14.195     187     6.8%
13.868     144     5.2%
9.356      98      3.6%
18.203     67      2.4%
23.589     62      2.3%
13.656     53      1.9%
13.316     41      1.5%
17.799     39      1.4%
(Missing)  1616    58.7%

Minimum 10 values

Value   Count   Frequency (%)
9.356   98      3.6%
13.213  208     7.6%
13.316  41      1.5%
13.656  53      1.9%
13.868  144     5.2%
14.195  187     6.8%
17.799  39      1.4%
18.203  67      2.4%
19.293  214     7.8%
20.293  22      0.8%

Maximum 10 values

Value   Count   Frequency (%)
23.589  62      2.3%
20.293  22      0.8%
19.293  214     7.8%
18.203  67      2.4%
17.799  39      1.4%
14.195  187     6.8%
13.868  144     5.2%
13.656  53      1.9%
13.316  41      1.5%
13.213  208     7.6%
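
Because climate_daily_mean_temp takes only eleven distinct values, its summary statistics can be rebuilt from the value counts alone. The sketch below does this as a consistency check on the frequency tables; the (value, count) pairs are read from this report, and the variable names are illustrative.

```python
# (value, count) pairs for climate_daily_mean_temp, read from this report.
value_counts = [
    (19.293, 214), (13.213, 208), (14.195, 187), (13.868, 144),
    (9.356, 98), (18.203, 67), (23.589, 62), (13.656, 53),
    (13.316, 41), (17.799, 39), (20.293, 22),
]

n_observed = sum(count for _, count in value_counts)          # non-missing rows
total = sum(value * count for value, count in value_counts)   # reported as Sum
mean = total / n_observed                                     # reported as Mean

print(f"n={n_observed}, sum={total:.3f}, mean={mean:.6f}")
```

The counts add up to 1135, which is the 2751 observations minus the 1616 missing, and the weighted total reproduces the reported Sum of 17537.801.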

climate_daily_max_temp
Real number (ℝ)

High correlation  Missing 

Daily maximum temperature

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.182599
Minimum: 17.553
Maximum: 30.083
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 17.553
5-th percentile: 17.553
Q1: 21.474
Median: 22.413
Q3: 26.343
95-th percentile: 30.083
Maximum: 30.083
Range: 12.53
Interquartile range (IQR): 4.869

Descriptive statistics

Standard deviation: 2.9483779
Coefficient of variation (CV): 0.12718065
Kurtosis: 0.15361931
Mean: 23.182599
Median Absolute Deviation (MAD): 1.066
Skewness: 0.324421
Sum: 26312.25
Variance: 8.6929324
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value      Count   Frequency (%)
26.343     214     7.8%
22.23      208     7.6%
23.023     187     6.8%
21.347     144     5.2%
17.553     98      3.6%
22.413     67      2.4%
30.083     62      2.3%
21.474     53      1.9%
20.768     41      1.5%
25.8       39      1.4%
(Missing)  1616    58.7%

Minimum 10 values

Value   Count   Frequency (%)
17.553  98      3.6%
20.768  41      1.5%
21.347  144     5.2%
21.474  53      1.9%
22.23   208     7.6%
22.413  67      2.4%
23.023  187     6.8%
25.8    39      1.4%
26.343  214     7.8%
26.769  22      0.8%

Maximum 10 values

Value   Count   Frequency (%)
30.083  62      2.3%
26.769  22      0.8%
26.343  214     7.8%
25.8    39      1.4%
23.023  187     6.8%
22.413  67      2.4%
22.23   208     7.6%
21.474  53      1.9%
21.347  144     5.2%
20.768  41      1.5%

climate_daily_min_temp
Real number (ℝ)

High correlation  Missing 

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.5503286
Minimum: 2.343
Maximum: 14.954
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 2.343
5-th percentile: 2.343
Q1: 3.763
Median: 6.616
Q3: 11.253
95-th percentile: 14.954
Maximum: 14.954
Range: 12.611
Interquartile range (IQR): 7.49

Descriptive statistics

Standard deviation: 4.0456474
Coefficient of variation (CV): 0.53582401
Kurtosis: -1.0855077
Mean: 7.5503286
Median Absolute Deviation (MAD): 2.853
Skewness: 0.50562955
Sum: 8569.623
Variance: 16.367263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value      Count   Frequency (%)
11.253     214     7.8%
3.763      208     7.6%
4.56       187     6.8%
7.436      144     5.2%
2.343      98      3.6%
14.79      67      2.4%
14.954     62      2.3%
6.034      53      1.9%
6.616      41      1.5%
10.493     39      1.4%
(Missing)  1616    58.7%

Minimum 10 values

Value   Count   Frequency (%)
2.343   98      3.6%
3.763   208     7.6%
4.56    187     6.8%
6.034   53      1.9%
6.616   41      1.5%
7.436   144     5.2%
10.493  39      1.4%
11.253  214     7.8%
13.968  22      0.8%
14.79   67      2.4%

Maximum 10 values

Value   Count   Frequency (%)
14.954  62      2.3%
14.79   67      2.4%
13.968  22      0.8%
11.253  214     7.8%
10.493  39      1.4%
7.436   144     5.2%
6.616   41      1.5%
6.034   53      1.9%
4.56    187     6.8%
3.763   208     7.6%

climate_7d_mean_temp
Real number (ℝ)

High correlation  Missing 

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.139061
Minimum: 9.215
Maximum: 21.742
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 9.215
5-th percentile: 9.215
Q1: 11.927
Median: 16.313
Q3: 19.038
95-th percentile: 21.742
Maximum: 21.742
Range: 12.527
Interquartile range (IQR): 7.111

Descriptive statistics

Standard deviation: 3.6217705
Coefficient of variation (CV): 0.2392335
Kurtosis: -1.2015134
Mean: 15.139061
Median Absolute Deviation (MAD): 3.532
Skewness: 0.034902052
Sum: 17182.834
Variance: 13.117222
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value      Count   Frequency (%)
19.038     214     7.8%
16.313     208     7.6%
11.927     187     6.8%
12.781     144     5.2%
9.215      98      3.6%
18.254     67      2.4%
21.742     62      2.3%
10.793     53      1.9%
12.665     41      1.5%
16.471     39      1.4%
(Missing)  1616    58.7%

Minimum 10 values

Value   Count   Frequency (%)
9.215   98      3.6%
10.793  53      1.9%
11.927  187     6.8%
12.665  41      1.5%
12.781  144     5.2%
16.313  208     7.6%
16.471  39      1.4%
18.254  67      2.4%
19.038  214     7.8%
19.865  22      0.8%

Maximum 10 values

Value   Count   Frequency (%)
21.742  62      2.3%
19.865  22      0.8%
19.038  214     7.8%
18.254  67      2.4%
16.471  39      1.4%
16.313  208     7.6%
12.781  144     5.2%
12.665  41      1.5%
11.927  187     6.8%
10.793  53      1.9%

climate_7d_max_temp
Real number (ℝ)

High correlation  Missing 

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.916914
Minimum: 17.721
Maximum: 30.867
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 17.721
5-th percentile: 17.721
Q1: 21.977
Median: 26.996
Q3: 29.423
95-th percentile: 30.867
Maximum: 30.867
Range: 13.146
Interquartile range (IQR): 7.446

Descriptive statistics

Standard deviation: 4.0747905
Coefficient of variation (CV): 0.15722514
Kurtosis: -0.91203039
Mean: 25.916914
Median Absolute Deviation (MAD): 2.708
Skewness: -0.60654343
Sum: 29415.697
Variance: 16.603917
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value      Count   Frequency (%)
29.704     214     7.8%
29.423     208     7.6%
25.079     187     6.8%
21.52      144     5.2%
17.721     98      3.6%
26.996     67      2.4%
30.867     62      2.3%
21.977     53      1.9%
20.768     41      1.5%
26.761     39      1.4%
(Missing)  1616    58.7%

Minimum 10 values

Value   Count   Frequency (%)
17.721  98      3.6%
20.768  41      1.5%
21.52   144     5.2%
21.977  53      1.9%
25.079  187     6.8%
26.761  39      1.4%
26.996  67      2.4%
28.696  22      0.8%
29.423  208     7.6%
29.704  214     7.8%

Maximum 10 values

Value   Count   Frequency (%)
30.867  62      2.3%
29.704  214     7.8%
29.423  208     7.6%
28.696  22      0.8%
26.996  67      2.4%
26.761  39      1.4%
25.079  187     6.8%
21.977  53      1.9%
21.52   144     5.2%
20.768  41      1.5%

climate_14d_mean_temp
Real number (ℝ)

High correlation  Missing 

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.042067
Minimum: 10.426
Maximum: 21.69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 10.426
5-th percentile: 10.426
Q1: 12.258
Median: 18.254
Q3: 19.069
95-th percentile: 21.69
Maximum: 21.69
Range: 11.264
Interquartile range (IQR): 6.811

Descriptive statistics

Standard deviation: 3.3876747
Coefficient of variation (CV): 0.21117446
Kurtosis: -1.3067725
Mean: 16.042067
Median Absolute Deviation (MAD): 3.436
Skewness: -0.22262646
Sum: 18207.746
Variance: 11.47634
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value      Count   Frequency (%)
19.069     214     7.8%
18.483     208     7.6%
14.595     187     6.8%
12.258     144     5.2%
10.426     98      3.6%
18.254     67      2.4%
21.69      62      2.3%
11.532     53      1.9%
12.57      41      1.5%
16.057     39      1.4%
(Missing)  1616    58.7%

Minimum 10 values

Value   Count   Frequency (%)
10.426  98      3.6%
11.532  53      1.9%
12.258  144     5.2%
12.57   41      1.5%
14.595  187     6.8%
16.057  39      1.4%
18.254  67      2.4%
18.483  208     7.6%
19.069  214     7.8%
20.262  22      0.8%

Maximum 10 values

Value   Count   Frequency (%)
21.69   62      2.3%
20.262  22      0.8%
19.069  214     7.8%
18.483  208     7.6%
18.254  67      2.4%
16.057  39      1.4%
14.595  187     6.8%
12.57   41      1.5%
12.258  144     5.2%
11.532  53      1.9%

climate_30d_mean_temp
Real number (ℝ)

High correlation  Missing 

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.024641
Minimum: 10.635
Maximum: 21.041
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 10.635
5-th percentile: 10.635
Q1: 11.635
Median: 18.576
Q3: 18.854
95-th percentile: 21.041
Maximum: 21.041
Range: 10.406
Interquartile range (IQR): 7.219

Descriptive statistics

Standard deviation: 3.4391822
Coefficient of variation (CV): 0.21461837
Kurtosis: -1.3241931
Mean: 16.024641
Median Absolute Deviation (MAD): 2.465
Skewness: -0.42123309
Sum: 18187.967
Variance: 11.827974
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value      Count   Frequency (%)
18.854     214     7.8%
18.576     208     7.6%
15.421     187     6.8%
11.076     144     5.2%
10.635     98      3.6%
18.794     67      2.4%
21.041     62      2.3%
11.635     53      1.9%
12.856     41      1.5%
15.775     39      1.4%
(Missing)  1616    58.7%

Minimum 10 values

Value   Count   Frequency (%)
10.635  98      3.6%
11.076  144     5.2%
11.635  53      1.9%
12.856  41      1.5%
15.421  187     6.8%
15.775  39      1.4%
18.576  208     7.6%
18.794  67      2.4%
18.854  214     7.8%
20.263  22      0.8%

Maximum 10 values

Value   Count   Frequency (%)
21.041  62      2.3%
20.263  22      0.8%
18.854  214     7.8%
18.794  67      2.4%
18.576  208     7.6%
15.775  39      1.4%
15.421  187     6.8%
12.856  41      1.5%
11.635  53      1.9%
11.076  144     5.2%

climate_temp_anomaly
Real number (ℝ)

High correlation  Missing 

Temperature anomaly from baseline

Distinct: 11
Distinct (%): 1.0%
Missing: 1616
Missing (%): 58.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.1579163
Minimum: 3.618
Maximum: 10.271
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 3.618
5-th percentile: 3.618
Q1: 6.505
Median: 7.489
Q3: 9.042
95-th percentile: 10.271
Maximum: 10.271
Range: 6.653
Interquartile range (IQR): 2.537

Descriptive statistics

Standard deviation: 2.2633511
Coefficient of variation (CV): 0.31620252
Kurtosis: -0.9276673
Mean: 7.1579163
Median Absolute Deviation (MAD): 1.553
Skewness: -0.39512055
Sum: 8124.235
Variance: 5.1227584
Monotonicity: Not monotonic
2025-11-25T00:05:44.716062image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
Histogram with fixed size bins (bins=11)
Value      Count  Frequency (%)
7.489      214    7.8%
3.654      208    7.6%
7.602      187    6.8%
10.271     144    5.2%
6.918      98     3.6%
3.618      67     2.4%
9.042      62     2.3%
9.839      53     1.9%
7.913      41     1.5%
10.025     39     1.4%
(Missing)  1616   58.7%

Value      Count  Frequency (%)
3.618      67     2.4%
3.654      208    7.6%
6.505      22     0.8%
6.918      98     3.6%
7.489      214    7.8%
7.602      187    6.8%
7.913      41     1.5%
9.042      62     2.3%
9.839      53     1.9%
10.025     39     1.4%

Value      Count  Frequency (%)
10.271     144    5.2%
10.025     39     1.4%
9.839      53     1.9%
9.042      62     2.3%
7.913      41     1.5%
7.602      187    6.8%
7.489      214    7.8%
6.918      98     3.6%
6.505      22     0.8%
3.654      208    7.6%
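
The summary statistics reported for climate_temp_anomaly can be cross-checked against its value counts. The sketch below rebuilds the non-missing sample from the eleven distinct values listed above and recomputes the sum, mean, median, variance and coefficient of variation with the Python standard library. The variable names are illustrative, and using the sample (n - 1) variance is an assumption that happens to agree with the reported figure.

```python
import statistics

# Distinct climate_temp_anomaly values and their counts, as listed above.
value_counts = {
    7.489: 214, 3.654: 208, 7.602: 187, 10.271: 144, 6.918: 98, 3.618: 67,
    9.042: 62, 9.839: 53, 7.913: 41, 10.025: 39, 6.505: 22,
}

# Expand the counts into the full non-missing sample.
sample = [value for value, count in value_counts.items() for _ in range(count)]

n = len(sample)
total = sum(sample)
mean = total / n
median = statistics.median(sample)
variance = statistics.variance(sample)  # sample variance (n - 1), an assumption
std = variance ** 0.5
cv = std / mean                          # coefficient of variation

print(f"n={n} sum={total:.3f} mean={mean:.7f} median={median}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
```

Comparing the printed values with the Sum, Mean, Variance and Coefficient of variation rows is a quick way to confirm that a value and its count were read correctly from the frequency tables.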

climate_standardized_anomaly
Real number (ℝ)

High correlation  Missing 

Distinct11
Distinct (%)1.0%
Missing1616
Missing (%)58.7%
Infinite0
Infinite (%)0.0%
Mean-0.19795242
Minimum-1.853
Maximum1.905
Zeros0
Zeros (%)0.0%
Negative560
Negative (%)20.4%
Memory size43.0 KiB

Quantile statistics

Minimum-1.853
5-th percentile-1.853
Q1-1.189
median0.007
Q31.074
95-th percentile1.905
Maximum1.905
Range3.758
Interquartile range (IQR)2.263

Descriptive statistics

Standard deviation1.3126605
Coefficient of variation (CV)-6.6311919
Kurtosis-1.2451726
Mean-0.19795242
Median Absolute Deviation (MAD)1.099
Skewness0.36631188
Sum-224.676
Variance1.7230776
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
Value      Count  Frequency (%)
0.007      214    7.8%
-1.853     208    7.6%
-1.092     187    6.8%
1.781      144    5.2%
-1.189     98     3.6%
-0.752     67     2.4%
1.905      62     2.3%
1.604      53     1.9%
0.19       41     1.5%
1.074      39     1.4%
(Missing)  1616   58.7%

Value      Count  Frequency (%)
-1.853     208    7.6%
-1.189     98     3.6%
-1.092     187    6.8%
-0.752     67     2.4%
0.007      214    7.8%
0.19       41     1.5%
0.959      22     0.8%
1.074      39     1.4%
1.604      53     1.9%
1.781      144    5.2%

Value      Count  Frequency (%)
1.905      62     2.3%
1.781      144    5.2%
1.604      53     1.9%
1.074      39     1.4%
0.959      22     0.8%
0.19       41     1.5%
0.007      214    7.8%
-0.752     67     2.4%
-1.092     187    6.8%
-1.189     98     3.6%

climate_heat_day_p90
Categorical

High correlation  Imbalance  Missing 

Heat day indicator (>90th percentile)

Distinct2
Distinct (%)0.2%
Missing1616
Missing (%)58.7%
Memory size167.5 KiB
0.0  1073
1.0  62

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3405
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value      Count  Frequency (%)
0.0        1073   39.0%
1.0        62     2.3%
(Missing)  1616   58.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0    1073   94.5%
1.0    62     5.5%
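
The two tables above report different percentages for the same counts. The figures are consistent with the first table dividing by all 2751 observations and the second dividing by the 1135 non-missing ones, and the sketch below checks that arithmetic. The denominators are inferred from the numbers in this report, not taken from the profiling tool's documentation, and the helper name `percent` is illustrative.

```python
# Counts for climate_heat_day_p90 as reported above.
counts = {"0.0": 1073, "1.0": 62}
n_missing = 1616

n_non_missing = sum(counts.values())
n_total = n_non_missing + n_missing

def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Share of every row in the dataset (missing rows included in the base).
share_of_all_rows = {k: percent(v, n_total) for k, v in counts.items()}
# Share of rows that actually carry a value.
share_of_non_missing = {k: percent(v, n_non_missing) for k, v in counts.items()}
missing_share = percent(n_missing, n_total)

print(share_of_all_rows)
print(share_of_non_missing)
print(missing_share)
```

With this much missing data, the choice of base matters when quoting how common a heat day is, so it is worth stating which one is meant.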

Most occurring characters

Value  Count  Frequency (%)
0      2208   64.8%
.      1135   33.3%
1      62     1.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     2270   66.7%
Other Punctuation  1135   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2208   97.3%
1      62     2.7%
Other Punctuation
Value  Count  Frequency (%)
.      1135   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  3405   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      2208   64.8%
.      1135   33.3%
1      62     1.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3405   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      2208   64.8%
.      1135   33.3%
1      62     1.8%
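
climate_heat_day_p90 is described as an indicator for days above the 90th percentile. A minimal sketch of how such a flag could be derived is shown below, assuming it compares a daily temperature against the constant 28.409 reported under climate_p90_threshold. The function name and the choice of temperature input are assumptions; the report does not state which temperature series the pipeline actually uses.

```python
from typing import Optional

P90_THRESHOLD = 28.409  # constant reported under climate_p90_threshold

def heat_day_flag(temperature: Optional[float],
                  threshold: float = P90_THRESHOLD) -> Optional[float]:
    """Hypothetical helper: 1.0 if temperature exceeds threshold, else 0.0.

    A missing temperature stays missing (None) instead of being coded as 0.0.
    """
    if temperature is None:
        return None
    return 1.0 if temperature > threshold else 0.0

# Illustrative temperatures, not values from the dataset.
flags = [heat_day_flag(t) for t in (21.0, 28.409, 30.2, None)]
print(flags)
```

The comparison is strict, following the ">90th percentile" wording, so a temperature exactly equal to the threshold is not flagged.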

climate_heat_day_p95
Categorical

High correlation  Imbalance  Missing 

Distinct2
Distinct (%)0.2%
Missing1616
Missing (%)58.7%
Memory size167.5 KiB
0.0  1073
1.0  62

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3405
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value      Count  Frequency (%)
0.0        1073   39.0%
1.0        62     2.3%
(Missing)  1616   58.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0    1073   94.5%
1.0    62     5.5%

Most occurring characters

Value  Count  Frequency (%)
0      2208   64.8%
.      1135   33.3%
1      62     1.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     2270   66.7%
Other Punctuation  1135   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2208   97.3%
1      62     2.7%
Other Punctuation
Value  Count  Frequency (%)
.      1135   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  3405   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      2208   64.8%
.      1135   33.3%
1      62     1.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3405   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      2208   64.8%
.      1135   33.3%
1      62     1.8%

climate_heat_stress_index
Real number (ℝ)

High correlation  Missing 

Heat stress index

Distinct11
Distinct (%)1.0%
Missing1616
Missing (%)58.7%
Infinite0
Infinite (%)0.0%
Mean18.312848
Minimum13.428
Maximum27.393
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size43.0 KiB

Quantile statistics

Minimum13.428
5-th percentile13.639
Q114.306
median17.923
Q321.523
95-th percentile27.393
Maximum27.393
Range13.965
Interquartile range (IQR)7.217

Descriptive statistics

Standard deviation3.536553
Coefficient of variation (CV)0.19311867
Kurtosis0.20907555
Mean18.312848
Median Absolute Deviation (MAD)3.6
Skewness0.58250383
Sum20785.083
Variance12.507207
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
Value      Count  Frequency (%)
21.523     214    7.8%
19.275     208    7.6%
17.347     187    6.8%
14.306     144    5.2%
13.639     98     3.6%
17.923     67     2.4%
27.393     62     2.3%
13.428     53     1.9%
15.721     41     1.5%
19.958     39     1.4%
(Missing)  1616   58.7%

Value      Count  Frequency (%)
13.428     53     1.9%
13.639     98     3.6%
14.306     144    5.2%
15.721     41     1.5%
17.347     187    6.8%
17.923     67     2.4%
19.275     208    7.6%
19.958     39     1.4%
21.523     214    7.8%
22.526     22     0.8%

Value      Count  Frequency (%)
27.393     62     2.3%
22.526     22     0.8%
21.523     214    7.8%
19.958     39     1.4%
19.275     208    7.6%
17.923     67     2.4%
17.347     187    6.8%
15.721     41     1.5%
14.306     144    5.2%
13.639     98     3.6%
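
The quantile statistics for climate_heat_stress_index can be reproduced from the value counts listed above. The sketch below assumes linear interpolation between order statistics (`method="inclusive"` in Python's `statistics.quantiles`); the report does not state which quantile method it uses, so this is an assumption that happens to match the reported numbers. Variable names are illustrative.

```python
import statistics

# Distinct climate_heat_stress_index values and counts, as listed above.
value_counts = {
    13.428: 53, 13.639: 98, 14.306: 144, 15.721: 41, 17.347: 187, 17.923: 67,
    19.275: 208, 19.958: 39, 21.523: 214, 22.526: 22, 27.393: 62,
}
sample = [value for value, count in value_counts.items() for _ in range(count)]

# Quartiles: three cut points splitting the sample into four parts.
q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")

# Twentieths: the first and last cut points are the 5th and 95th percentiles.
twentieths = statistics.quantiles(sample, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
value_range = max(sample) - min(sample)

print(f"Q1={q1:.3f} median={median:.3f} Q3={q3:.3f} IQR={iqr:.3f}")
print(f"p5={p5:.3f} p95={p95:.3f} range={value_range:.3f}")
```

Because the variable takes only eleven distinct values, every quantile lands on one of them, which is why the 95th percentile and the maximum coincide.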

climate_p90_threshold
Categorical

Constant  Missing 

Distinct1
Distinct (%)0.1%
Missing1616
Missing (%)58.7%
Memory size170.8 KiB
28.409  1135

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters6810
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row28.409
2nd row28.409
3rd row28.409
4th row28.409
5th row28.409

Common Values

Value      Count  Frequency (%)
28.409     1135   41.3%
(Missing)  1616   58.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
28.409  1135   100.0%

Most occurring characters

Value  Count  Frequency (%)
2      1135   16.7%
8      1135   16.7%
.      1135   16.7%
4      1135   16.7%
0      1135   16.7%
9      1135   16.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5675   83.3%
Other Punctuation  1135   16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      1135   20.0%
8      1135   20.0%
4      1135   20.0%
0      1135   20.0%
9      1135   20.0%
Other Punctuation
Value  Count  Frequency (%)
.      1135   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  6810   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      1135   16.7%
8      1135   16.7%
.      1135   16.7%
4      1135   16.7%
0      1135   16.7%
9      1135   16.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6810   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      1135   16.7%
8      1135   16.7%
.      1135   16.7%
4      1135   16.7%
0      1135   16.7%
9      1135   16.7%

climate_p95_threshold
Categorical

Constant  Missing 

Distinct1
Distinct (%)0.1%
Missing1616
Missing (%)58.7%
Memory size170.8 KiB
29.704  1135

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters6810
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row29.704
2nd row29.704
3rd row29.704
4th row29.704
5th row29.704

Common Values

Value      Count  Frequency (%)
29.704     1135   41.3%
(Missing)  1616   58.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
29.704  1135   100.0%

Most occurring characters

Value  Count  Frequency (%)
2      1135   16.7%
9      1135   16.7%
.      1135   16.7%
7      1135   16.7%
0      1135   16.7%
4      1135   16.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5675   83.3%
Other Punctuation  1135   16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      1135   20.0%
9      1135   20.0%
7      1135   20.0%
0      1135   20.0%
4      1135   20.0%
Other Punctuation
Value  Count  Frequency (%)
.      1135   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  6810   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      1135   16.7%
9      1135   16.7%
.      1135   16.7%
7      1135   16.7%
0      1135   16.7%
4      1135   16.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6810   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      1135   16.7%
9      1135   16.7%
.      1135   16.7%
7      1135   16.7%
0      1135   16.7%
4      1135   16.7%

climate_p99_threshold
Categorical

Constant  Missing 

Distinct1
Distinct (%)0.1%
Missing1616
Missing (%)58.7%
Memory size170.8 KiB
31.797  1135

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters6810
Distinct characters5
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row31.797
2nd row31.797
3rd row31.797
4th row31.797
5th row31.797

Common Values

Value      Count  Frequency (%)
31.797     1135   41.3%
(Missing)  1616   58.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
31.797  1135   100.0%

Most occurring characters

Value  Count  Frequency (%)
7      2270   33.3%
3      1135   16.7%
1      1135   16.7%
.      1135   16.7%
9      1135   16.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5675   83.3%
Other Punctuation  1135   16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
7      2270   40.0%
3      1135   20.0%
1      1135   20.0%
9      1135   20.0%
Other Punctuation
Value  Count  Frequency (%)
.      1135   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  6810   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
7      2270   33.3%
3      1135   16.7%
1      1135   16.7%
.      1135   16.7%
9      1135   16.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6810   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
7      2270   33.3%
3      1135   16.7%
1      1135   16.7%
.      1135   16.7%
9      1135   16.7%

climate_season
Categorical

High correlation  Missing 

Distinct4
Distinct (%)0.4%
Missing1616
Missing (%)58.7%
Memory size170.8 KiB
Spring  609
Winter  295
Summer  129
Autumn  102

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters6810
Distinct characters12
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAutumn
2nd rowSpring
3rd rowWinter
4th rowSpring
5th rowSpring

Common Values

Value      Count  Frequency (%)
Spring     609    22.1%
Winter     295    10.7%
Summer     129    4.7%
Autumn     102    3.7%
(Missing)  1616   58.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
spring  609    53.7%
winter  295    26.0%
summer  129    11.4%
autumn  102    9.0%

Most occurring characters

Value             Count  Frequency (%)
r                 1033   15.2%
n                 1006   14.8%
i                 904    13.3%
S                 738    10.8%
p                 609    8.9%
g                 609    8.9%
e                 424    6.2%
t                 397    5.8%
m                 360    5.3%
u                 333    4.9%
Other values (2)  397    5.8%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  5675   83.3%
Uppercase Letter  1135   16.7%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
r      1033   18.2%
n      1006   17.7%
i      904    15.9%
p      609    10.7%
g      609    10.7%
e      424    7.5%
t      397    7.0%
m      360    6.3%
u      333    5.9%
Uppercase Letter
Value  Count  Frequency (%)
S      738    65.0%
W      295    26.0%
A      102    9.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  6810   100.0%

Most frequent character per script

Latin
Value             Count  Frequency (%)
r                 1033   15.2%
n                 1006   14.8%
i                 904    13.3%
S                 738    10.8%
p                 609    8.9%
g                 609    8.9%
e                 424    6.2%
t                 397    5.8%
m                 360    5.3%
u                 333    4.9%
Other values (2)  397    5.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6810   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
r                 1033   15.2%
n                 1006   14.8%
i                 904    13.3%
S                 738    10.8%
p                 609    8.9%
g                 609    8.9%
e                 424    6.2%
t                 397    5.8%
m                 360    5.3%
u                 333    4.9%
Other values (2)  397    5.8%
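
The character tables above follow from the Unicode properties of each code point. As a sketch of the idea, the snippet below uses Python's standard `unicodedata` module to tally characters and general categories for the climate_season values, weighting each label by its reported count. The two-letter codes (`Lu`, `Ll`) are what `unicodedata.category` returns; the mapping to the long names used in this report is added here for readability and is not taken from the profiling tool.

```python
import unicodedata
from collections import Counter

# Non-missing climate_season values and their counts, as reported above.
season_counts = {"Spring": 609, "Winter": 295, "Summer": 129, "Autumn": 102}

# Long names for the two general categories that occur in these labels.
CATEGORY_NAMES = {"Lu": "Uppercase Letter", "Ll": "Lowercase Letter"}

char_counts = Counter()
category_counts = Counter()
for label, count in season_counts.items():
    for ch in label:
        # Each character occurs once per row carrying this label.
        char_counts[ch] += count
        code = unicodedata.category(ch)
        category_counts[CATEGORY_NAMES.get(code, code)] += count

total_chars = sum(char_counts.values())
for ch, c in char_counts.most_common(3):
    print(ch, c, f"{100 * c / total_chars:.1f}%")
print(dict(category_counts))
```

Every label has six characters and exactly one capital, which is why the uppercase total equals the number of non-missing rows.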

sa_biomarker_standards
Categorical

Constant 

South African biomarker reference standards

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
1.0  2751

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

Value  Count  Frequency (%)
1.0    2751   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1.0    2751   100.0%

Most occurring characters

Value  Count  Frequency (%)
1      2751   33.3%
.      2751   33.3%
0      2751   33.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5502   66.7%
Other Punctuation  2751   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      2751   50.0%
0      2751   50.0%
Other Punctuation
Value  Count  Frequency (%)
.      2751   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8253   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      2751   33.3%
.      2751   33.3%
0      2751   33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8253   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      2751   33.3%
.      2751   33.3%
0      2751   33.3%

cd4_correction_applied
Categorical

High correlation  Imbalance 

Quality flag: CD4 missing codes removed

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
0.0  2696
1.0  55

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    2696   98.0%
1.0    55     2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0    2696   98.0%
1.0    55     2.0%

Most occurring characters

Value  Count  Frequency (%)
0      5447   66.0%
.      2751   33.3%
1      55     0.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5502   66.7%
Other Punctuation  2751   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      5447   99.0%
1      55     1.0%
Other Punctuation
Value  Count  Frequency (%)
.      2751   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8253   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      5447   66.0%
.      2751   33.3%
1      55     0.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8253   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      5447   66.0%
.      2751   33.3%
1      55     0.7%

final_comprehensive_fix_applied
Categorical

Constant 

Quality flag: Comprehensive corrections applied

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
1.0  2751

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

Value  Count  Frequency (%)
1.0    2751   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1.0    2751   100.0%

Most occurring characters

Value  Count  Frequency (%)
1      2751   33.3%
.      2751   33.3%
0      2751   33.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5502   66.7%
Other Punctuation  2751   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      2751   50.0%
0      2751   50.0%
Other Punctuation
Value  Count  Frequency (%)
.      2751   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8253   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      2751   33.3%
.      2751   33.3%
0      2751   33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8253   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      2751   33.3%
.      2751   33.3%
0      2751   33.3%

total_protein_extreme_flag
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
0.0  2751

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters2
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    2751   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0    2751   100.0%

Most occurring characters

Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5502   66.7%
Other Punctuation  2751   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      5502   100.0%
Other Punctuation
Value  Count  Frequency (%)
.      2751   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8253   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8253   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%
Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
0.0  2751

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters2
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    2751   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0    2751   100.0%

Most occurring characters

Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5502   66.7%
Other Punctuation  2751   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      5502   100.0%
Other Punctuation
Value  Count  Frequency (%)
.      2751   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8253   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8253   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%
Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
0.0  2751

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters2
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    2751   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0    2751   100.0%

Most occurring characters

Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5502   66.7%
Other Punctuation  2751   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      5502   100.0%
Other Punctuation
Value  Count  Frequency (%)
.      2751   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8253   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8253   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      5502   66.7%
.      2751   33.3%

quality_harmonization_version
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size161.2 KiB
2.0  2751

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters8253
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row2.0
2nd row2.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

Value  Count  Frequency (%)
2.0    2751   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2.0    2751   100.0%

Most occurring characters

Value  Count  Frequency (%)
2      2751   33.3%
.      2751   33.3%
0      2751   33.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5502   66.7%
Other Punctuation  2751   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      2751   50.0%
0      2751   50.0%
Other Punctuation
Value  Count  Frequency (%)
.      2751   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8253   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      2751   33.3%
.      2751   33.3%
0      2751   33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8253   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      2751   33.3%
.      2751   33.3%
0      2751   33.3%

waist_circ_unit_correction_applied
Boolean

Constant 

Quality flag: Waist circ unit corrected

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.2 KiB
False  2751

Value  Count  Frequency (%)
False  2751   100.0%

Interactions

2025-11-25T00:05:39.305261image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.792157image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.275785image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.898227image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.449246image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:34.725222image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.225565image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.724682image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.290132image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.762527image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.257223image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.760201image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.260633image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.846242image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.341343image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.826467image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.311825image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.936459image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.485181image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:34.758911image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.260295image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.760204image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.320626image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.798764image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.292626image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.797637image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.295682image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.881371image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.376397image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.860511image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.347147image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.976222image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.520795image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:34.794198image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.297102image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.797540image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.351888image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.834893image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.327282image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.833646image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.331727image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.915928image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.412129image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.895596image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.384378image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.015469image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.556284image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:34.829335image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.331193image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.831983image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.381981image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.868199image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.360020image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.868329image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.364334image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.949760image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.445048image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.926221image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.416911image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.050287image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.592653image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:34.869360image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.366284image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.867875image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.415027image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.904701image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.396482image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.903110image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.399247image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.985017image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.480379image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.960000image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.451847image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.088458image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.633087image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:34.912212image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.406903image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:35.907006image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.448845image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:36.942811image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.435459image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:37.944004image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:38.528393image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.024993image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.519828image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:39.998857image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:40.489995image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-11-25T00:05:41.128359image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/

Correlations

| | Age (at enrolment) | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | Sex | cd4_correction_applied | climate_14d_mean_temp | climate_30d_mean_temp | climate_7d_max_temp | climate_7d_mean_temp | climate_daily_max_temp | climate_daily_mean_temp | climate_daily_min_temp | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_season | climate_standardized_anomaly | climate_temp_anomaly | month | season | year |
| Age (at enrolment) | 1.000 | -0.130 | -0.088 | 0.200 | 0.054 | 0.027 | 0.028 | 0.023 | 0.033 | 0.012 | 0.021 | 0.028 | 0.052 | 0.052 | 0.025 | 0.041 | 0.005 | -0.020 | 0.019 | 0.050 | 0.044 |
| CD4 cell count (cells/µL) | -0.130 | 1.000 | 0.031 | 0.168 | 1.000 | 0.007 | 0.004 | -0.003 | -0.004 | 0.038 | 0.042 | 0.018 | 0.000 | 0.000 | 0.008 | 0.000 | 0.024 | 0.033 | -0.016 | 0.000 | 0.042 |
| HIV viral load (copies/mL) | -0.088 | 0.031 | 1.000 | 0.099 | 0.488 | 0.021 | 0.037 | 0.004 | 0.042 | 0.082 | 0.097 | 0.074 | 0.000 | 0.000 | 0.035 | 0.056 | 0.041 | -0.010 | -0.053 | 0.027 | 0.024 |
| Sex | 0.200 | 0.168 | 0.099 | 1.000 | 0.000 | 0.000 | 0.000 | 0.051 | 0.060 | 0.000 | 0.000 | 0.042 | 0.000 | 0.000 | 0.027 | 0.000 | 0.036 | 0.000 | 0.072 | 0.029 | 0.058 |
| cd4_correction_applied | 0.054 | 1.000 | 0.488 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.008 | 0.000 |
| climate_14d_mean_temp | 0.027 | 0.007 | 0.021 | 0.000 | 0.000 | 1.000 | 0.978 | 0.968 | 0.912 | 0.839 | 0.650 | 0.527 | 0.997 | 0.997 | 0.983 | 0.925 | 0.006 | -0.345 | 0.433 | 0.925 | 0.997 |
| climate_30d_mean_temp | 0.028 | 0.004 | 0.037 | 0.000 | 0.000 | 0.978 | 1.000 | 0.956 | 0.912 | 0.861 | 0.714 | 0.615 | 0.849 | 0.849 | 0.954 | 0.774 | 0.049 | -0.370 | 0.468 | 0.774 | 0.849 |
| climate_7d_max_temp | 0.023 | -0.003 | 0.004 | 0.051 | 0.000 | 0.968 | 0.956 | 1.000 | 0.871 | 0.825 | 0.614 | 0.486 | 0.417 | 0.417 | 0.941 | 0.793 | -0.022 | -0.336 | 0.496 | 0.793 | 0.417 |
| climate_7d_mean_temp | 0.033 | -0.004 | 0.042 | 0.060 | 0.000 | 0.912 | 0.912 | 0.871 | 1.000 | 0.736 | 0.707 | 0.736 | 0.998 | 0.998 | 0.916 | 0.664 | 0.278 | -0.206 | 0.351 | 0.664 | 0.998 |
| climate_daily_max_temp | 0.012 | 0.038 | 0.082 | 0.000 | 0.000 | 0.839 | 0.861 | 0.825 | 0.736 | 1.000 | 0.883 | 0.647 | 0.998 | 0.998 | 0.859 | 0.760 | 0.203 | -0.037 | 0.288 | 0.760 | 0.998 |
| climate_daily_mean_temp | 0.021 | 0.042 | 0.097 | 0.000 | 0.000 | 0.650 | 0.714 | 0.614 | 0.707 | 0.883 | 1.000 | 0.900 | 0.998 | 0.998 | 0.672 | 0.738 | 0.564 | 0.220 | 0.163 | 0.738 | 0.998 |
| climate_daily_min_temp | 0.028 | 0.018 | 0.074 | 0.042 | 0.000 | 0.527 | 0.615 | 0.486 | 0.736 | 0.647 | 0.900 | 1.000 | 0.609 | 0.609 | 0.537 | 0.941 | 0.744 | 0.266 | 0.076 | 0.941 | 0.609 |
| climate_heat_day_p90 | 0.052 | 0.000 | 0.000 | 0.000 | 0.000 | 0.997 | 0.849 | 0.417 | 0.998 | 0.998 | 0.998 | 0.609 | 1.000 | 0.991 | 0.997 | 0.670 | 0.436 | 0.998 | 0.996 | 0.670 | 0.991 |
| climate_heat_day_p95 | 0.052 | 0.000 | 0.000 | 0.000 | 0.000 | 0.997 | 0.849 | 0.417 | 0.998 | 0.998 | 0.998 | 0.609 | 0.991 | 1.000 | 0.997 | 0.670 | 0.436 | 0.998 | 0.996 | 0.670 | 0.991 |
| climate_heat_stress_index | 0.025 | 0.008 | 0.035 | 0.027 | 0.000 | 0.983 | 0.954 | 0.941 | 0.916 | 0.859 | 0.672 | 0.537 | 0.997 | 0.997 | 1.000 | 0.933 | 0.037 | -0.295 | 0.377 | 0.933 | 0.997 |
| climate_season | 0.041 | 0.000 | 0.056 | 0.000 | 0.000 | 0.925 | 0.774 | 0.793 | 0.664 | 0.760 | 0.738 | 0.941 | 0.670 | 0.670 | 0.933 | 1.000 | 0.817 | 0.785 | 0.914 | 1.000 | 0.670 |
| climate_standardized_anomaly | 0.005 | 0.024 | 0.041 | 0.036 | 0.000 | 0.006 | 0.049 | -0.022 | 0.278 | 0.203 | 0.564 | 0.744 | 0.436 | 0.436 | 0.037 | 0.817 | 1.000 | 0.782 | -0.497 | 0.817 | 0.436 |
| climate_temp_anomaly | -0.020 | 0.033 | -0.010 | 0.000 | 0.000 | -0.345 | -0.370 | -0.336 | -0.206 | -0.037 | 0.220 | 0.266 | 0.998 | 0.998 | -0.295 | 0.785 | 0.782 | 1.000 | -0.689 | 0.785 | 0.998 |
| month | 0.019 | -0.016 | -0.053 | 0.072 | 0.000 | 0.433 | 0.468 | 0.496 | 0.351 | 0.288 | 0.163 | 0.076 | 0.996 | 0.996 | 0.377 | 0.914 | -0.497 | -0.689 | 1.000 | 0.967 | 0.387 |
| season | 0.050 | 0.000 | 0.027 | 0.029 | 0.008 | 0.925 | 0.774 | 0.793 | 0.664 | 0.760 | 0.738 | 0.941 | 0.670 | 0.670 | 0.933 | 1.000 | 0.817 | 0.785 | 0.967 | 1.000 | 0.282 |
| year | 0.044 | 0.042 | 0.024 | 0.058 | 0.000 | 0.997 | 0.849 | 0.417 | 0.998 | 0.998 | 0.998 | 0.609 | 0.991 | 0.991 | 0.997 | 0.670 | 0.436 | 0.998 | 0.387 | 0.282 | 1.000 |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
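As a rough illustration of what nullity correlation means, the sketch below computes Pearson correlations between per-column missingness indicators with pandas. It is not taken from this report's generating code: the helper name `nullity_correlation` and the toy frame (including its column names and values) are hypothetical, and dropping columns that are always present or always missing is an assumption made so that every correlation is defined.

```python
import numpy as np
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Correlate the missingness indicators (1 = missing) of each column pair."""
    indicators = df.isnull().astype(int)
    # Columns that are never or always missing have zero variance, so their
    # correlation is undefined; this sketch simply leaves them out.
    varying = indicators.loc[:, indicators.nunique() > 1]
    return varying.corr()


# Hypothetical toy data, not rows from the dataset.
toy = pd.DataFrame(
    {
        "cd4": [369.0, 701.0, np.nan, 350.0, np.nan, 276.0],
        "viral_load": [0.0, np.nan, np.nan, np.nan, 0.0, np.nan],
        "daily_mean_temp": [np.nan, np.nan, 17.8, np.nan, 19.3, np.nan],
        "sex": ["F", "F", "M", "M", "F", "M"],
    }
)

nullity = nullity_correlation(toy)
print(nullity.round(3))
```

In this toy frame `daily_mean_temp` is present exactly where `cd4` is missing, so that pair has a nullity correlation of -1; a value near +1 would instead mean two columns tend to be missing together, and a value near 0 means their missingness is unrelated.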

Sample

| | study_source | primary_date | year | month | season | latitude | longitude | jhb_subregion | city | province | country | Age (at enrolment) | Sex | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | date | Country | Clinical Study ID | Location of study follow-up | coordinate_source | coordinate_precision | geographic_source | HIV_status | johannesburg_metro_valid | study_site_location | climate_daily_mean_temp | climate_daily_max_temp | climate_daily_min_temp | climate_7d_mean_temp | climate_7d_max_temp | climate_14d_mean_temp | climate_30d_mean_temp | climate_temp_anomaly | climate_standardized_anomaly | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_p90_threshold | climate_p95_threshold | climate_p99_threshold | climate_season | sa_biomarker_standards | cd4_correction_applied | final_comprehensive_fix_applied | total_protein_extreme_flag | dphru_053_final_corrections_applied | ezin_002_final_corrections_applied | quality_harmonization_version | waist_circ_unit_correction_applied |
| 3377 | JHB_Aurum_009 | 2014-02-15 | 2014.0 | 2.0 | Summer | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 24.0 | Female | 369.0 | 0.0 | 2014-02-15 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3378 | JHB_Aurum_009 | 2014-04-09 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 38.0 | Female | 701.0 | NaN | 2014-04-09 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3379 | JHB_Aurum_009 | 2014-08-12 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 21.0 | Male | 654.0 | NaN | 2014-08-12 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3380 | JHB_Aurum_009 | 2014-04-29 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 29.0 | Male | 350.0 | NaN | 2014-04-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3381 | JHB_Aurum_009 | 2013-04-29 | 2013.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 35.0 | Female | 324.0 | 0.0 | 2013-04-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 17.799 | 25.800 | 10.493 | 16.471 | 26.761 | 16.057 | 15.775 | 10.025 | 1.074 | 0.0 | 0.0 | 19.958 | 28.409 | 29.704 | 31.797 | Autumn | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3382 | JHB_Aurum_009 | 2014-06-26 | 2014.0 | 6.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 22.0 | Male | 276.0 | NaN | 2014-06-26 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3383 | JHB_Aurum_009 | 2013-11-19 | 2013.0 | 11.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 38.0 | Female | NaN | NaN | 2013-11-19 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 19.293 | 26.343 | 11.253 | 19.038 | 29.704 | 19.069 | 18.854 | 7.489 | 0.007 | 0.0 | 0.0 | 21.523 | 28.409 | 29.704 | 31.797 | Spring | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3384 | JHB_Aurum_009 | 2014-09-08 | 2014.0 | 9.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | NaN | Male | NaN | NaN | 2014-09-08 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3385 | JHB_Aurum_009 | 2013-08-24 | 2013.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 22.0 | Female | 525.0 | NaN | 2013-08-24 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 9.356 | 17.553 | 2.343 | 9.215 | 17.721 | 10.426 | 10.635 | 6.918 | -1.189 | 0.0 | 0.0 | 13.639 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3386 | JHB_Aurum_009 | 2014-03-24 | 2014.0 | 3.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 42.0 | Male | 287.0 | NaN | 2014-03-24 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |

| | study_source | primary_date | year | month | season | latitude | longitude | jhb_subregion | city | province | country | Age (at enrolment) | Sex | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | date | Country | Clinical Study ID | Location of study follow-up | coordinate_source | coordinate_precision | geographic_source | HIV_status | johannesburg_metro_valid | study_site_location | climate_daily_mean_temp | climate_daily_max_temp | climate_daily_min_temp | climate_7d_mean_temp | climate_7d_max_temp | climate_14d_mean_temp | climate_30d_mean_temp | climate_temp_anomaly | climate_standardized_anomaly | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_p90_threshold | climate_p95_threshold | climate_p99_threshold | climate_season | sa_biomarker_standards | cd4_correction_applied | final_comprehensive_fix_applied | total_protein_extreme_flag | dphru_053_final_corrections_applied | ezin_002_final_corrections_applied | quality_harmonization_version | waist_circ_unit_correction_applied |
| 6118 | JHB_Aurum_009 | 2013-07-17 | 2013.0 | 7.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 23.0 | Male | 174.0 | NaN | 2013-07-17 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 13.868 | 21.347 | 7.436 | 12.781 | 21.520 | 12.258 | 11.076 | 10.271 | 1.781 | 0.0 | 0.0 | 14.306 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6119 | JHB_Aurum_009 | 2013-06-06 | 2013.0 | 6.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 36.0 | Male | 110.0 | NaN | 2013-06-06 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 13.656 | 21.474 | 6.034 | 10.793 | 21.977 | 11.532 | 11.635 | 9.839 | 1.604 | 0.0 | 0.0 | 13.428 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6120 | JHB_Aurum_009 | 2014-06-17 | 2014.0 | 6.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 29.0 | Male | 393.0 | 0.0 | 2014-06-17 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6121 | JHB_Aurum_009 | 2014-02-03 | 2014.0 | 2.0 | Summer | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 34.0 | Female | 202.0 | NaN | 2014-02-03 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6122 | JHB_Aurum_009 | 2014-04-29 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 34.0 | Female | 31.0 | NaN | 2014-04-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6123 | JHB_Aurum_009 | 2014-04-23 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 31.0 | Male | 365.0 | NaN | 2014-04-23 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6124 | JHB_Aurum_009 | 2013-08-27 | 2013.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 31.0 | Female | 586.0 | NaN | 2013-08-27 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 9.356 | 17.553 | 2.343 | 9.215 | 17.721 | 10.426 | 10.635 | 6.918 | -1.189 | 0.0 | 0.0 | 13.639 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6125 | JHB_Aurum_009 | 2014-08-14 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 65.0 | Male | 409.0 | NaN | 2014-08-14 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6126 | JHB_Aurum_009 | 2014-08-04 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 28.0 | Male | 455.0 | NaN | 2014-08-04 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6127 | JHB_Aurum_009 | 2013-11-16 | 2013.0 | 11.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 23.0 | Male | 300.0 | NaN | 2013-11-16 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 19.293 | 26.343 | 11.253 | 19.038 | 29.704 | 19.069 | 18.854 | 7.489 | 0.007 | 0.0 | 0.0 | 21.523 | 28.409 | 29.704 | 31.797 | Spring | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |

Duplicate rows

Most frequently occurring

| | study_source | primary_date | year | month | season | latitude | longitude | jhb_subregion | city | province | country | Age (at enrolment) | Sex | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | date | Country | Clinical Study ID | Location of study follow-up | coordinate_source | coordinate_precision | geographic_source | HIV_status | johannesburg_metro_valid | study_site_location | climate_daily_mean_temp | climate_daily_max_temp | climate_daily_min_temp | climate_7d_mean_temp | climate_7d_max_temp | climate_14d_mean_temp | climate_30d_mean_temp | climate_temp_anomaly | climate_standardized_anomaly | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_p90_threshold | climate_p95_threshold | climate_p99_threshold | climate_season | sa_biomarker_standards | cd4_correction_applied | final_comprehensive_fix_applied | total_protein_extreme_flag | dphru_053_final_corrections_applied | ezin_002_final_corrections_applied | quality_harmonization_version | waist_circ_unit_correction_applied | # duplicates |
| 0 | JHB_Aurum_009 | 2013-07-15 | 2013.0 | 7.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 23.0 | Male | NaN | NaN | 2013-07-15 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 13.868 | 21.347 | 7.436 | 12.781 | 21.52 | 12.258 | 11.076 | 10.271 | 1.781 | 0.0 | 0.0 | 14.306 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 1 | JHB_Aurum_009 | 2014-03-29 | 2014.0 | 3.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 32.0 | Female | NaN | NaN | 2014-03-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 2 | JHB_Aurum_009 | 2014-04-02 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 49.0 | Male | NaN | NaN | 2014-04-02 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 3 | JHB_Aurum_009 | 2014-08-12 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 39.0 | Male | NaN | NaN | 2014-08-12 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 4 | JHB_Aurum_009 | 2014-10-28 | 2014.0 | 10.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 37.0 | Female | NaN | NaN | 2014-10-28 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
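
A table like the one above can be approximated with pandas by grouping fully identical rows and counting them. This is a minimal sketch under stated assumptions, not the report's own implementation: the helper name `most_frequent_duplicates` and the toy frame are hypothetical, and `dropna=False` is used on the assumption that rows matching on NaN values should count as duplicates (as they appear to in the table, where CD4 and viral load are NaN).

```python
import numpy as np
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, top: int = 10) -> pd.DataFrame:
    """Return each fully duplicated row once, with how many times it occurs."""
    # keep=False marks every member of a duplicated group, not just the repeats.
    dupes = df[df.duplicated(keep=False)]
    return (
        dupes.groupby(list(df.columns), dropna=False)  # keep NaN-valued keys
        .size()
        .reset_index(name="# duplicates")
        .sort_values("# duplicates", ascending=False)
        .head(top)
        .reset_index(drop=True)
    )


# Hypothetical toy data, not rows from the dataset.
toy = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B", "B", "C"],
        "age": [23.0, 23.0, 32.0, 32.0, 32.0, 40.0],
        "cd4": [np.nan, np.nan, np.nan, np.nan, np.nan, 300.0],
    }
)

counts = most_frequent_duplicates(toy)
print(counts)
```

Before dropping such rows it is worth checking whether they are true re-entries or distinct participants who happen to share a visit date, age, and sex; with most measurement columns missing, identical rows are not necessarily the same person.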